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Abstract 

Module  systems  for  languages  with  complex  type  systems,  such  as  Standard  ML,  often  lack  the 
ability  to  express  mutually  recursive  type  and  function  dependencies  across  module  boundaries. 
Previous  work  by  Crary,  Harper  and  Puri  [5]  set  out  a  type-theoretic  foundation  for  recursive 
modules  in  the  context  of  a  phase-distinction  calculus  for  higher-order  modules.  Two  constructs 
were  introduced  for  encoding  recursive  modules:  a  fixed-point  module  and  a  reeursively  dependent 
signature.  Unfortunately,  the  implementations  of  both  constructs  involve  the  use  of  equi-recursive 
type  constructors  at  higher-order  kinds,  the  equivalence  of  which  is  not  known  to  be  decidable. 

In  this  paper,  we  show  that  the  practicality  of  recursive  modules  is  not  contingent  upon  that 
of  equi-recursive  constructors.  We  begin  with  the  theoretical  infrastructure  described  above  and 
study  precisely  how  equi-recursiveness  is  used  in  the  recursive  module  constructs,  resulting  in 
a  clarification  and  generalization  of  the  underlying  ideas.  We  then  examine  in  depth  how  the 
recursive  module  constructs  in  the  revised  type  system  can  serve  as  the  target  of  elaboration  for  a 
recursive  module  extension  to  Standard  ML. 
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1  Introduction 


Modular  programming  is  critical  to  the  development  of  large-scale  software  systems.  By  enabling 
programs  to  be  broken  into  relatively  independent  components  that  interact  with  one  another 
through  statically-defined  interfaces,  modularization  provides  a  way  for  multiple  programmers  to 
work  on  a  project  simultaneously  and  coherently.  The  power  of  a  module  system  lies  in  the  flexibility 
of  its  facility  for  expressing  dependencies  between  modular  components.  Some  languages  (such  as 
Java,  Modula-3  and  Ada95)  only  allow  inter-module  dependencies  to  be  hardwired.  Yet  frequently 
in  such  languages  those  dependencies  are  permitted  to  be  mutually  recursive.  The  module  system  of 
Standard  ML  [16]  is  much  more  expressive.  Its  functor  construct  encourages  generic  programming 
in  the  style  of  Modula’s  generics  or  C-|— |-’s  templates,  but  also  allows  a  modular  component  of  a 
large  program  to  be  written  and  compiled  separately  by  abstracting  the  component  over  its  import 
dependencies.  However,  Standard  ML  does  not  support  any  kind  of  mutually  recursive  type  or 
function  definitions  across  module  boundaries.  In  other  words,  dependencies  between  Standard 
ML  modules,  separately  compiled  or  not,  must  be  strictly  hierarchical. 

A  mechanism  for  writing  recursive  modules  is  one  of  the  most  requested  extensions  for  Stan¬ 
dard  ML.  One  obvious  reason  is  that  the  need  to  write  mutually  recursive  datatypes  and  functions 
in  inseparable  bundles  inhibits  modularity.  Yet  even  forgetting  the  goal  of  separate  compilation, 
recursive  modules  seem  to  the  typical  ML  programmer  like  a  natural  generalization  of  recursive 
constructs  in  the  core  language.  Consider  a  simple  syntax.  If  one  can  write  mutually  recursive 
datatypes  separated  by  the  keyword  “and”  and  mutually  recursive  functions  separated  by  the  key¬ 
word  “and”,  why  can’t  one  write  mutually  recursive  modules  separated  by  the  keyword  “and”? 
{Note:  ML  actually  has  an  “and”  syntax  for  simultaneous  module  declarations,  but  such  decla¬ 
rations  are  not  mutually  recursive.)  Consider  a  simple  semantics.  Suppose  that  these  mutually 
recursive  modules  are  restricted  to  contain  only  datatypes  and  functions.  A  mutually  recursive 
bundle  of  modules  could  be  compiled  into  a  single  non-recursive  module  by  joining  the  datatypes  in 
all  the  modules  into  one  recursive  bundle  of  datatypes  and  joining  the  functions  in  all  the  modules 
into  one  recursive  bundle  of  functions. 

The  simplicity  of  this  “naive”  programmer’s  point  of  view  is  a  dual-edged  sword.  To  the  compiler 
hacker,  it  gives  the  impression  that  recursive  modules,  perhaps  in  a  limited  form,  would  be  an  easy 
extension  to  design  and  implement.  To  the  type  theorist,  it  makes  them  appear  like  so  much  ad 
hoc  syntactic  sugar.  From  either  angle,  it  does  not  suggest  a  fruitful  line  of  research,  which  is 
unfortunate  considering  the  perennial  demand  for  recursive  modules. 

1.1  Workarounds 

In  the  absence  of  recursive  modules,  there  are  two  major  techniques  that  programmers  typically 
employ  to  work  around  them.  The  first,  more  primitive,  technique  is  based  on  the  philosophy  that 
mutually  recursive  types  and  terms  ought  to  be  written  together  to  begin  with  and  “modularized 
after  the  fact”  by  copying  them  into  separate  modules.  For  example,  if  we  wanted  to  separate 
two  mutually  recursive  functions  f  and  g  into  separate  modules  A  and  B,  we  would  have  to  first 
define  them  together  in  one  module  with  a  fun  declaration,  then  copy  the  functions  into  A  and  B 
as  follows: 

structure  AandB  =  struct 

fun  f  X  =  . . .  g( . . . )  ... 

and  gy=  ...  f(...)  ... 

end 
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structure  A  =  struct  val  f  =  AandB.f  end 
structure  B  =  struct  val  g  =  AandB.g  end 


The  same  technique  can  be  applied  to  datatypes  by  means  of  ML’s  “datatype  copying”  mechanism. 

While  this  first  workaround  is  certainly  straightforward,  it  is  unsatisfying  for  two  reasons.  First, 
it  is  no  more  nor  less  than  a  namespace  management  trick  and  does  not  address  the  problem  of 
modularizing  the  actual  recursive  code.  Second,  if  the  module  is  large,  as  is  typical  for  the  kinds  of 
modules  one  would  like  to  use  recursive  modules  to  break  up,  copying  each  type  or  value  declaration 
into  a  separate  module  after  the  fact  can  be  quite  cumbersome. 

The  second  common  approach,  known  often  as  the  “parameterization”  technique,  allows  for  real 
separate  compilation  of  mutually  recursive  datatypes  and  functions,  but  is  in  turn  more  complex. 
There  are  a  number  of  variations  on  this  approach,  but  the  basic  idea  is  to  define  the  recursive 
modules  hierarchically  so  that  they  can  be  written  separately.  In  order  to  do  that,  the  functions 
in  the  first  module  in  the  hierarchy,  which  is  defined  in  the  absence  of  the  others,  must  be  pa¬ 
rameterized  over  the  functions  they  refer  to  from  the  other  modules.  When  referring  to  the  first 
module,  the  functions  in  the  other  modules  must  supply  themselves  (recursively)  as  arguments  to 
instantiate  the  import  dependencies  of  the  functions  in  the  first  module.  The  first  module  can  then 
be  defined  “for  real”  by  applying  its  original  parameterized  functions  to  the  functions  defined  in 
the  other  modules,  thereby  tying  the  recursive  knot  manually.  Continuing  the  simple  example  from 
the  first  technique,  we  could  do  the  following: 


structure  preA 
struct  fun  f 
structure  B  = 
struct  fun  g 
structure  A  = 
struct  val  f 


g  X  =  ...  g( . . . )  ...  end 

y  =  let  val  f  =  preA.f  g  in  ...  f(...) 
=  preA.f  B.g  end 


.  . .  end 


A  similar  technique  can  be  applied  to  datatypes  by  means  of  polymorphic  types  (which  are  really 
A-expressions  at  the  type  level). 

While  this  kind  of  approach  is  undeniably  effective  at  breaking  up  recursive  code,  it  also  forces 
considerable  changes  at  the  level  of  individual  functions  and  types,  not  to  mention  the  overall 
program  structure.  Every  function  in  A  that  needs  to  refer  to  B  must  explicitly  convert  its  import 
dependencies  (such  as  B.g  above)  into  function  arguments.  Code  transformations  like  this  that 
affect  all  the  code  in  a  module  are  what  modules  were  invented  to  simplify.  For  instance,  it  is  not 
necessary  to  program  with  functors  in  ML,  but  they  simplify  and  clarify  programming  tremendously 
in  the  situation  where  all  the  code  in  a  module  is  parameterized  over  the  code  in  another  module. 
Unfortunately,  there  is  no  way  to  generalize  the  recursive  parameterization  trick  to  the  module 
level,  e.g.: 


functor  preA  (B  :  BSIG)  = 

struct  fun  f  x  =  ...  B.g(...)  ...  end 
structure  B  = 

let  structure  A  =  preA(B)  in  (*  Type  Error:  B  not  defined  here  *) 
struct  fun  gy=  ...  A.f(...)  ...  end 

In  addition,  the  complexity  of  the  parameterization  technique  introduces  inefficiency  due  to  function 
applications  that  is  avoided  by  simply  writing  the  functions  in  all  the  modules  together  under  one 
fixed-point. 
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1.2  Type  Theory  for  Recursive  Modules 

In  response  to  the  inadequacies  of  the  recursive  module  workarounds,  there  have  been  a  few  attempts 
to  extend  ML-like  languages  with  real  recursive  modules,  most  notably  by  Flatt  and  Felleisen  [8] 
and  Duggan  and  Sourelis  [6,  7].  However,  the  former’s  extension  is  bound  up  with  dynamic  linking 
and  first-class  modules  in  the  concept  of  a  unit.  The  latter’s  extension  is  tied  to  mixin  modules, 
which  can  be  used  to  achieve  the  effect  of  virtual  types  in  a  functional  language.  Both  extensions 
are  interesting  (and  we  will  return  to  them  in  Section  6  on  related  work) ,  but  neither  explains  what 
a  recursive  module  is,  in  a  fundamental  sense  independent  of  the  specific  language  extension  in 
which  it  appears. 

Toward  this  end,  Crary,  Harper  and  Puri  [5]  (hereafter,  CHP)  have  given  an  analysis  of  recursive 
modules  in  the  setting  of  type  theory,  specifically  in  a  phase- distinction  calculus  in  the  style  of 
Harper,  Mitchell  and  Moggi  [9]  (hereafter,  HMM).  The  main  reason  for  working  in  such  a  calculus 
is  to  show  how  module  constructs  can  be  split  into  separate  compile-time  (type-level)  and  run¬ 
time  (term-level)  core  constructs.  This  ability  to  “phase-split”  a  module  is  important  for  several 
reasons.  Modern  type-directed  compilers  track  type  information  throughout  compilation  to  enable 
certain  type-based  optimizations.  Phase-splitting  enables  them  to  compile  module  code  into  core 
language  code  without  discarding  the  type  part  of  the  module.  In  addition,  working  in  the  style  of 
HMM  provides  compatibility  with  fully  transparent  higher-order  modules,  including  functors  and 
nested  structures.  Although  higher-order  modules  are  not  the  concern  of  this  paper,  we  will  see 
in  Section  5.2  that  the  ability  to  phase-split  functors  and  functor  applications  leads  to  increased 
expressiveness  for  recursive  modules. 

As  the  recursive  core  constructs  at  the  type  and  term  levels  of  the  core  calculus  take  the 
form  of  a  fixed-point,  the  analogous  module  construct  that  CHP  study  is  a  fixed-point  module. 
This  fixed-point  module  has  a  straightforward  typechecking  rule,  and  an  intuitive  phase-splitting 
interpretation  that  separates  the  module  into  a  fixed-point  type  constructor  and  a  fixed-point  term. 
However,  simple  examples  exhibit  serious  limitations  in  the  new  module  construct.  To  remedy  the 
problem,  CHP  introduce  a  new  signature  construct  called  a  recursively  dependent  signature,  or  rds, 
that  more  accurately  models  types  inside  a  recursive  module.  Rds’s  are  required  by  their  phase¬ 
splitting  interpretation  to  be  fully  transparent,  and  hence  the  combination  of  fixed-point  modules 
with  rds’s  is  termed  “transparent  recursive  module”  programming,  in  contrast  to  the  “opaque” 
programming  with  the  fixed-point  construct  alone. 

While  transparent  recursive  modules  constitute  a  useful  and  expressive  extension  to  the  module 
system,  there  is  a  serious  problem  with  the  entire  analysis.  The  phase-splitting  rules  for  both 
fixed-point  modules  and  rds’s  depend  on  the  assumption  that  the  recursive  core  construct  at  the 
type  level  is  equi-recursive,  i.e.  a  fixed-point  over  type  constructors  that  is  equal  to  its  unrolling, 
not  merely  isomorphic.  This  is  an  odd  assumption  since  recursive  datatypes  in  Standard  ML  are 
iso-recursive,  requiring  explicit  rolls  (and  unrolls)  into  (and  out  of)  the  fixed-point.  Moreover, 
checking  type  equivalence  in  the  presence  of  equi-recursive  type  constructors  of  higher  kind  is  not 
known  to  be  decidable,  thus  casting  doubt  on  the  practicality  of  the  whole  type  system. 

1.3  Overview 

In  this  paper,  we  will  begin  with  the  rigorous  analysis  and  technical  infrastructure  afforded  by  CHP 
and  show  what  is  needed  to  make  recursive  modules  useful  and  practical.  Our  approach  to  the 
theory  involves  determining  which  constructs  are  fundamentally  equi-recursive  and  which  are  not, 
with  the  ulterior  goal  of  developing  a  practical  design  for  recursive  modules  that  depends  only  on 
iso-recursive  type  constructors. 
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Figure  1:  The  Core  Calculus 


First,  in  Section  2,  we  review  the  definitions  and  motivations  for  the  fixed-point  and  rds  con¬ 
structs  introduced  by  CHP,  as  well  as  the  type-theoretic  framework  in  which  they  are  situated. 
Then,  in  Section  3,  we  split  fixed-points  and  rds’s  into  opaque  and  transparent  forms  in  an  attempt 
to  isolate  the  uses  of  equi-recursive  constructors  and  to  clarify  the  similarities  and  differences  be¬ 
tween  opaque  and  transparent  recursive  modules.  In  Section  4,  we  examine  how  well  the  theoretical 
constructs  defined  in  Section  3  fare  in  practice,  culminating  in  a  proposal  for  a  modest  recursive 
module  extension  to  Standard  ML.  Finally,  Section  5  explores  some  directions  for  future  work  re¬ 
lated  to  the  interaction  of  recursive  modules  and  abstraction,  and  Section  6  provides  a  brief  survey 
of  related  work. 

2  Summary  of  CHP  Analysis 

In  this  section,  we  will  briefly  summarize  the  type-theoretic  framework  of  CHP,  the  new  constructs 
they  introduce  and  the  problems  that  arise. 

2.1  Framework 

The  analysis  is  conducted  in  a  phase-distinction  calculus,  consisting  of  a  core  calculus  and  a  prim¬ 
itive  structure  calculus.  The  core  calculus  (Figure  1)  is  a  fairly  standard  higher-order  predicative 
polymorphic  lambda  calculus  with  a  few  extensions,  namely  dependent  and  singleton  kinds,  equi- 
recursive  type  constructors  and  fixed-point  expressions. 

Dependent  and  singleton  kinds  have  proven  to  be  a  convenient  and  expressive  mechanism  for 
fine-grained  control  of  type  information  in  module  signatures.  Specifically,  the  kind  5(c)  classifies 
all  types  (actually,  type  constructors  of  base  kind  T)  that  are  equivalent  to  c.  The  kind  T,a:Ki.K2 
is  used  to  classify  pairs  of  constructors  where  the  kind  of  the  second  constructor  may  refer  to  the 
name  of  the  first  constructor.  The  kind  Iia:Ki.K2  is  used  to  classify  constructor  functions  where  the 
result  kind  may  refer  to  the  name  of  the  argument.  To  illustrate  all  three,  the  following  faux-ML 
functor  signature 

(X  :  sig  type  t  end)  ->  sig  type  ’a  u;  type  v  =  X.t  *  int  u  end 
could  be  translated  into  the  kind 


Iia:T.  S/3:r^r.S(a  x  (3{int)) 

As  shown  in  Figure  2,  one  can  define  singletons  at  higher  kind,  written  5 (c  :  k),  inductively  in  terms 
of  singletons  at  kind  T.  The  thesis  work  of  Chris  Stone  [26]  explores  the  issue  of  singleton  kinds 
in  great  depth,  including  a  detailed  proof  that  type  equivalence  remains  decidable  in  a  language 
containing  them.  Dependent  and  singleton  kinds  are  currently  implemented  in  the  MIL  intermediate 
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S(c  :  T) 
S(c:S(c')) 
S(c  :  Iia:Ki.K,2) 
5(c  :  1) 
S(c  :  Sa:Ki.K2) 

Figure 


=  5(c) 

=  5(c) 

“=  nQ;:Ki.5(cQ;  :  K2) 

def  ^ 

“=  5(c.l  :  Ki)  X  5(c.2  :  «;2[c.1/q;]) 
2:  Higher-Order  Singletons 


language  of  the  TILT  compiler  for  Standard  ML  [19].  See  Stone  and  Harper  [25]  for  a  more  concise 
account  of  the  motivation  and  practicality  of  singleton  kinds. 

The  recursive  operators  at  the  constructor  and  term  levels  take  the  form  ^=a:K.c  and  fix{x:a.e), 
respectively.  The  type  constructor  fi=a:K.c  denotes  the  unique  fixed-point  of  the  constructor  func¬ 
tion  Xa:K.c  and  is  subject  to  the  following  well-formedness  rule: 

a  0  Dom(r)  T  h  K  kind  r[a  t  /«]  b  c  |  k 
T  h  ^=a:K.c  :  k 

The  second  premise  checks  that  c  has  the  right  kind  and  that  it  is  contractive  in  a  context  where  a 
is  not.  In  the  context  of  this  rule,  this  means  that  all  references  in  c  to  a  must  occur  under  a  type 
construction  operation  like  ax  a.  In  general,  contractiveness  can  be  viewed  as  a  way  of  ensuring 
that  there  exists  a  unique  fixed-point  for  \a:K.c.  For  instance,  the  non-contractive  identity  function 
Xa:K.a  has  no  unique  fixed-point  because  every  constructor  of  kind  k  is  a  fixed-point  of  the  identity. 

The  notation  =  in  //=  indicates  that  the  recursive  type  constructor  is  equi-recursive,  as  opposed 
to  iso-recursive  constructors,  which  we  will  denote  with  Equi-recursive  constructors  obey  the 
following  “equational  unrolling”  rule,  which  exhibits  the  fact  that  fi=  is  a  fixed-point  (here,  as  in 
the  rest  of  this  paper,  we  write  E'[Ei,  ■  ■  ■  ,EnfXi,  ■  ■  ■  ,X„]  to  indicate  the  simultaneous  capture¬ 
avoiding  substitution  of  Ei,  •  •  • ,  for  Xi,  •  •  • ,  X^  in  E'): 

a  0  Dom(F)  F  h  k  kind  F[a  t  /«]  b  c  |  k 
F  b  ia=a:K.c  =  c  [fj,a:K.c/a]  :  k 

Uniqueness  is  guaranteed  by  the  following  “bisimilarity”  rule,  which  says  that  any  constructor  that 
is  a  fixed-point  of  Xa:K.c  is  equivalent  to  the  fi=: 

a  0  Dom(F)  F  b  «:  kind  F  b  c'  =  c[cYa]  :  k  F[q;  j  k]  b  c  |  k 

F  b  c'  =  fra:K.c  :  k 

The  fixed-point  equations  induced  by  unrolling  and  bisimilarity  make  ix=  apposite  for  CHP’s  and 
our  theoretical  development  of  recursive  modules,  as  we  will  see  in  the  next  section.  However, 
having  a  true  fixed-point  operator  in  our  core  calculus  complicates  the  procedure  for  deciding 
constructor  equivalence  to  the  point  that  it  is  not  known  to  be  decidable.  We  will  study  iso¬ 
recursive  constructors,  which  complicate  the  theory  but  are  more  practical,  in  Section  4.2.1. 

The  term-level  construct  fix{x:a.e)  enables  the  definition  of  recursive  values.  The  well-formedness 
rule  for  fixed-point  expressions  is  similar  to  the  rule  for  equi-recursive  constructors: 

x  0  Dom(F)  F  b  fj  type  F[x  ]  a]\-  e  I  a 
F  h  fix{x:a.e)  :  a 
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signatures  S  ::=  [a:K,  cr] 

modules  M  ::=  s|[c,  e] 

eonstruetors  c  ::=  •••  \FstM 

terms  e  ::=  •••  \  Snd  M 

eontexts  T  ::=  •  •  •  |  r[s  :  S]  \  r[s  I  S] 

Figure  3:  The  Structure  Calculus 

The  last  premise  involves  a  valuability  check  analogous  to  the  eontraetiveness  check.  Essentially, 
it  must  be  statically  checkable  that  the  evaluation  of  e  will  terminate  without  side-effects,  and 
references  in  e  to  x  must  occur  inside  A-abstractions  (i.e.  functions).  The  fix  construct  is  basically  a 
generalization  of  ML’s  fun  declarations  to  allow  valuable  expressions  to  be  included  in  the  recursive 
definition. 

The  primitive  structure  calculus  (Figure  3)  consists  of  two  new  syntactic  classes:  modules  M 
and  signatures  S.  The  module  class  is  inhabited  by  a  primitive  phase-split  module  construct  and 
the  signature  class  by  a  primitive  phase-split  signature  construct.  Each  phase-split  construct  is 
composed  directly  from  core  constructs.  A  phase-split  module  [c,  e]  consists  of  a  type  constructor 
c  and  a  term  e.  Sometimes  we  will  use  the  notation  [a  =  c,  e]  as  shorthand  for  [c,  e[c/Q;]].  A 
phase-split  signature  [a:K.a]  classifies  a  phase-split  module,  under  the  following  typing  rule: 

a  0  Dom(F)  F  h  c  :  k  F  h  e  :  a[c/a\  F[q;  :  n]\-  a  type 

F  h  [c,  e]  :  [a:K.a] 

The  “static”  (or  “compile-time”  or  “constructor”)  part  of  a  phase-split  module  can  be  extracted 
through  a  new  form  of  type  constructor  Fst  M,  and  the  dynamic  part  (or  “run-time”  or  “term” ) 
part  can  be  extracted  through  a  new  form  of  expression  Snd  M.  They  are  subject  to  the  following 
typing  rules: 

F  h  M  :  [a'-K.a]  F  h  M  :  [a:K.a] 

Th  Fst  M  :k  Th  Snd  M  :  a[Fst  M/a] 

Valuability  for  modules  is  embodied  in  the  judgement  F  h  M  J,  S',  which  ensures  that  the  static 
part  of  the  module  is  contractive  and  the  dynamic  part  is  valuable: 

a  0  Dom(F)  FI-cJ,k  FheJ,  a[c/a\  F[q;  :  k]\-  a  type 

F  h  [c,  e]  i  [a-.K.a] 

HMM  show  that  higher-order  modules,  such  as  functors  and  nested  structures,  are  already 
present  in  the  primitive  structure  calculus  described  here.  We  may  thus  assume  compatibility  with 
higher-order  modules  while  working  within  a  simpler  system.  The  method  HMM  employ  is  to  define 
typing  rules  for  higher-order  module  and  signature  constructs,  then  to  equate  those  constructs  with 
primitive  forms  through  non-standard  equational  rules  on  modules  and  signatures.  As  we  shall  see 
below,  CHP  use  the  same  technique  to  define  the  phase- splitting  interpretation  of  the  recursive 
module  and  signature  constructs  they  introduce  in  terms  of  core  constructs. 

The  full  set  of  typing  rules  for  the  core  and  structure  calculi  can  be  found  in  Appendices  A.l 
and  A. 2,  respectively. 
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2.2  Fixed-Point  Modules 

CHP  study  two  styles  of  recursive  module  programming:  opaque  and  transparent.  Both  utilize  a 
simple  recursive  module  construct  analogous  to  the  recursive  constructs  for  constructors  and  terms 
in  the  core  calculus:  the  fixed-point.  CHP’s  fixed-point  module  takes  the  form  fix{s:S.M),  and 
obeys  the  following  typing  rule,  typical  for  a  fixed-point: 

s  0  Dom(r)  r[s  t  5]  b  M  i  5  P  h  5 
P  h  fix{s:S.M)  :  S 

The  module  body  is  required  to  be  valuable  in  a  context  where  the  recursive  module  variable  is 
not.  This  valuability  restriction  is  essentially  the  same  as  the  restriction  on  polymorphic  bindings 
imposed  in  the  revised  1997  definition  of  Standard  ML[16],  but  here  it  is  applied  to  all  the  bindings 
in  the  body  of  a  fixed-point,  not  just  the  polymorphic  ones.  Note  that  this  restriction  is  rather 
conservative  in  the  sense  that  there  are  perfectly  acceptable  recursive  modules  with  non-valuable 
bodies,  e.g.  those  containing  the  binding  val  x  =  ref  2.  The  practical  implications  of  this  con- 
servativity  are  not  really  the  focus  of  CHP  or  this  paper,  but  we  will  return  to  the  issue  briefly  in 
Section  4.4. 

Phase-splitting  for  fixed-point  modules  works  as  follows.  Note  that  and  are  used  here  to 
signify  the  compile-time  and  run-time  components,  respectively,  of  primitive  structure  variable  s 
after  phase-splitting,  but  they  are  merely  specially-named  variables,  unrelated  to  s. 

a,  s,  s^,  s''  0  Dom(r)  P  h  k  kind  r[a  :  k]\-  a  type 
r[s'^  ]  k]\-  c  [  K  r[s‘^  :  k][s'’  t  b  e  I  a[c/a\ 

P  b  fix{s\[a\K.a'].[c[Fst  s/ s^],e[Fst  s,  Snd  s/s'^,  s'']])  = 

[s'^  =  fj,=s'^:K.c, fix{s'':a[s'^ /a].e)]  :  [a:K.a\ 

Essentially,  phase-splitting  a  fixed-point  module  wraps  an  equi-recursive  constructor  around  the 
static  part  c  and  a  term-level  fixed-point  around  the  dynamic  part  e.  It  should  be  noted  that  the 
use  of  an  equi-recursive  constructor  here  is  necessary  for  the  term-level  fixed-point  to  typecheck. 
Specifically,  e  has  type  a[c/a\  but  it  needs  to  match  the  type  of  the  fixed-point,  which  is  a[s'^/a\. 
With  s'^  as  an  abbreviation  for  fi=s'^:K.c,  it  is  clear  that  c  (with  free  references  to  s'^)  is  the  unrolling 
of  s''.  Thus,  the  equality  of  (T[c/a]  and  cj[s‘'/a]  is  predicated  on  the  equality  of  s"  and  its  unrolling, 
i.e.  on  the  use  of  the  equi-recursive  p.=  . 

2.3  Opaque  Recursive  Modules 

What  CHP  call  opaque  reeursive  modules  are  simply  the  fixed-point  modules  presented  above,  with 
no  other  extensions  to  the  type  system.  These  modules  per  se  are  termed  opaque  because  the 
only  information  afforded  the  body  M  about  the  recursive  module  variable  s  in  fixed-point  module 
fix{s:S.M)  is  the  signature  S.  It  should  be  understood  that  making  S  opaque  is  not  an  abstraction 
mechanism.  It  does  not  hide  the  implementation  of  types  defined  inside  the  fixed-point  from  code 
outside  the  module,  because  the  fixed-point  module  can  always  be  phase-split  and  its  static  part 
revealed.  (In  practice,  such  abstraction  can  be  achieved  by  opaquely  ascribing  S  to  the  entire 
fixed-point,  e.g.  fix{-  ■  •)  :>  S.)  Rather,  it  merely  hides  some  of  the  static  part  of  s  from  the  code 
in  the  body  M  of  the  module. 

What  purpose,  then,  does  this  opacity  serve?  In  cases  where  the  fixed-point  is  used  solely  to 
write  mutually  recursive  function  definitions  that  cross  module  boundaries,  there  is  no  reason  not 
to  make  S  transparent.  Trouble  arises,  however,  when  one  wishes  to  define  mutually  recursive  type 
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definitions  that  cross  module  boundaries.  In  order  to  encode  such  definitions,  at  least  one  of  them 
must  be  defined  using  a  module-recursive  type  constructor^,  i.e.  a  type  constructor  in  the  body  of 
the  fixed-point  that  refers  to  the  recursive  module  variable.  The  issue  is  that  a  module-recursive 
constructor  definition  like  type  t  =  c,  where  c  refers  to  s,  cannot  be  realized  in  the  signature  S, 
since  S  must  be  valid  in  the  ambient  context  that  does  not  contain  s.  Thus,  making  t  opaque  in  S 
is  a  seemingly  necessary  evil  attendant  upon  module-recursive  constructor  definitions. 

A  simple  example  of  how  the  opacity  forced  upon  module-recursive  constructor  definitions 
causes  a  problem  is  the  following  failed  attempt  to  use  an  opaque  recursive  module  to  implement 
a  (recursive)  integer  list  type  through  the  recursive  module.  (Assume  that  the  core  language  has 
been  extended  with  binary  sum  types  and  the  corresponding  standard  term-level  constructors,  ini 
and  inr,  and  destructor,  case.  Note  also  that  we  will  use  a  sort  of  pidgin-ML  syntax  to  convey 
the  examples,  but  this  is  intended  to  represent  code  in  our  internal  language  and  make  it  more 
readable,  not  to  imply  extensions  to  ML  itself.) 

signature  LIST  = 
sig 

type  t 
val  nil  :  t 

val  cons  :  int  *  t  ->  t 
end 

structure  List  =  fix  (List  :  LIST, 
type  t  =  1  +  int  *  List.t 
val  nil  :  t  =  ini  () 
fun  cons  (x  :  int,  L  :  List.t)  :  t  = 
inr  (x,L) 

) 

In  this  example,  the  body  of  the  fixed-point  is  well-typed,  but  the  cons  function  has  type  int  * 
List.t  ->  t,  not  the  required  int  *  t  ->  t,  so  the  fixed-point  is  not  valid.  Alternatively,  one 
could  try  assigning  L  the  type  t  in  the  definition  of  cons,  but  this  clearly  fails  as  well  because  the 
inr  requires  int  *  List.t,  not  int  *  t. 

Another  good  example  given  by  CHP  is  the  canonical  ExpDec  example.  The  idea  is  that  one 
would  like  to  separate  two  syntactic  classes  in  a  mutually  recursive  abstract  syntax  tree,  such  as 
expressions  (Exp)  and  declarations  (Dec),  into  two  mutually  recursive  modules.  A  similar  problem 
arises  here  as  in  the  List  example.  In  addition,  the  ExpDec  example  illustrates  problems  in  pro¬ 
gramming  with  mutually  recursive  modules  that  do  not  affect  the  single  recursive  module  of  List. 
However,  the  distinction  between  List  and  ExpDec  is  not  the  focus  of  the  core  analysis  of  CHP,  so 
we  will  not  be  concerned  with  the  details  of  ExpDec  until  Section  3.2. 

2.4  Transparent  Recnrsive  Modnles  and  Recursively  Dependent  Signatures 

The  problem  with  opaque  recursive  modules  boils  down  to  the  inability  to  make  any  correlation 
between  the  implementation  of  a  type  in  the  body  (e.g.  t)  and  the  corresponding  type  component 
of  the  recursive  module  variable  (e.g.  List.t),  when  the  type  is  left  abstract  in  the  signature  of 
the  recursive  module  variable.  CHP  suggest  that  the  way  to  remedy  this  is  to  somehow  make 
the  signature  transparent,  so  that  the  module-recursive  type  definitions  in  the  body  are  reflected 

^Module-recursive  is  our  term,  not  CHP’s. 


in  the  signature.  As  discussed  above,  naively  realizing  the  signature  with  the  module-recursive 
type  definition  would  result  in  an  ill-formed  signature  with  free  references  to  the  recursive  module 
variable  List.  Luckily,  the  effect  of  such  a  realization  can  be  achieved  through  a  more  complex 
signature  mechanism  introduced  by  CHP:  a  recursively  dependent  signature,  or  rds. 

An  rds  takes  the  form  ps'^.S,  where  the  constructor  variable  s'^  is  bound  in  signature  S.  (Actu¬ 
ally,  CHP’s  rds  construct  takes  the  form  ps.S,  where  s  is  a  module  variable,  but  since  a  signature 
can  only  contain  references  to  the  static  part  of  s,  the  two  constructs  amount  to  the  same  thing.) 
Intuitively,  a  type  definition  in  S  that  refers  to  s'^  is  expressing  a  module-recursive  type  not  express¬ 
ible  in  normal  signatures;  can  essentially  be  viewed  as  a  placeholder  for  the  static  part  of  the 
module  to  which  the  rds  is  ascribed.  The  introduction  and  elimination  rules  for  rds’s  are  simple, 
but  somewhat  unusual.  Applications  of  these  rules  can  be  thought  of  as  roll  and  unroll  operations, 
respectively,  at  the  module  level: 

PPM:  S[Fst  M/s^]  P  h  psFS  sig  PPM:  psFS 

PPM:  psFS  PPM:  S[Fst  M/s^] 

Now  consider  our  List  example  again.  We  can  close  the  free  module-recursive  references  in  the 
fully  transparent  version  of  the  LIST  signature  by  binding  List  in  an  rds: 

p  List . 
sig 

type  t  =  1  +  int  *  List.t 
end 

When  the  recursive  module  variable  List  is  assigned  the  above  rds,  the  rds  elimination  rule  allows 
List  to  be  given  the  unrolled  form  of  this  rds,  with  free  references  to  itself.  Projecting  from 
the  unrolled  signature,  it  is  then  apparent  that  List.t  =  1  +  int  *  List.t,  which  equals  the 
definition  of  t  in  the  body  of  the  fixed-point. 

What  this  implies  about  rds’s  is  that  their  interpretation  into  the  core  language  requires  the 
use  of  equi-recursive  constructors  to  “solve”  the  module-recursive  type  equations  introduced  by  the 
roll/unroll  rules.  The  phase-splitting  rule  for  rds’s  achieves  this  by  wrapping  the  fully  transparent 
static  part  of  an  rds  with  a  p=  by  means  of  singleton  kinds: 

a,  s^  0  Dom(r)  T  P  k  kind  r[s'^  |  k]\-  c  [  n  r[s'’  :  k][q;  :  k]  P  <t  type 
T  P  /3s'^.[a:S(c  :  k).cj]  =  [s^:5{p=s^:k.c  :  k).it[s‘^/q;]]  sig 

The  way  rds  phase-splitting  works  requires  the  body  of  the  rds  to  be  fully  transparent.  Hence, 
the  combination  of  a  fixed-point  module  and  an  rds  is  termed  a  transparent  recursive  module.  The 
typechecking  rule  for  rds’s  reflects  the  full  transparency  requirement: 

a,  s'^  0  Dom(r)  T  P  k  kind  r[s'’  :  k]  P  5  =  [a:5{c  :  k).cj]  sig  r[s'^  I  k]  P  c  |  k 

T  P  psFS  sig 


3  Recursive  Modules  in  Theory 

The  recursive  module  dichotomy  as  we  have  presented  it  is  somewhat  unsatisfying  because  the 
distinction  between  opaque  and  transparent  recursive  modules  is  not  entirely  clear.  Consider  the 
following:  Both  idioms  use  the  same  fixed-point  module  construct,  but  transparent  modules  use  fully 
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transparent  rds’s  while  opaque  modules  use  arbitrary  non-rds  signatures.  Opaque  recursive  modules 
appear  not  to  be  able  to  encode  module-recursive  type  constructors,  rendering  them  inexpressive 
in  comparison  to  transparent  recursive  modules.  However,  since  rds’s  can  be  interpreted  through 
phase-splitting  as  primitive  signatures,  it  would  seem  that  transparent  recursive  modules  are  really 
a  subset  of  opaque  ones! 

In  fact,  this  paradox  is  easily  resolved  by  a  clarification  of  what  is  meant  by  a  “non-rds  signa¬ 
ture”.  When  we  assert  that  the  signature  in  an  opaque  recursive  module  is  not  an  rds,  it  is  not 
enough  to  merely  prohibit  the  signature  from  having  the  syntactic  form  ps'^.S]  the  point  is  that  the 
signature  should  not  require  the  expressive  power  necessary  to  encode  transparent  definitions  of 
module-recursive  type  constructors.  In  our  system,  module-recursive  constructors  are  implemented 
with  equi-recursive  constructors,  so  the  precise  definition  of  a  non-rds  signature  is  one  whose  phase- 
split  form  is  expressible  without  the  use  of  equi-recursive  constructors.  Now  there  is  no  question 
that  transparent  modules  have  the  expressive  edge. 

That  this  confusion  in  terms  was  glossed  over  by  CHP  indicates  an  implicit  assumption  in  their 
presentation  of  recursive  modules:  namely,  that  the  notion  of  a  module-recursive  constructor  in 
fixed-points  and  rds’s  encapsulates  the  ways  that  equi-recursiveness  is  used,  so  the  programmer  who 
is  equipped  with  these  recursive  module  constructs  should  not  need  to  make  explicit  use  of  equi- 
recursive  constructors.  It  is  reasonable  to  want  to  isolate  equi-recursiveness  in  the  phase-splitting 
rules  for  fixed-points  and  rds’s.  The  core  language  of  ML  allows  only  a  very  restricted  form  of 
recursive  type  definition  with  the  datatype  mechanism,  whose  elaboration  only  requires  the  use  of 
iso-recursive  constructors.  Ultimately,  we  will  not  want  to  rely  on  constructs  that  are  implemented 
with  equi-recursive  constructors,  so  it  is  unfortunate  that  the  phase-splitting  interpretations  of  both 
fixed-point  modules  and  rds’s  seem  to  require  them. 

In  this  section,  we  will  reassess  the  idioms  of  opaque  and  transparent  recursive  module  program¬ 
ming  by  defining  them  in  terms  of  opaque  and  transparent  variants,  respectively,  of  the  fixed-point 
module  and  rds  constructs.  Splitting  fixed-points  and  rds’s  this  way  will  help  to  isolate  and  clarify 
the  uses  of  equi-recursiveness  in  the  opaque  and  transparent  idioms.  In  addition,  we  will  state  a 
“fundamental”  property  of  recursive  modules  unifying  the  two  idioms  and  prove  that  they  both 
adhere  to  it.  We  end  the  section  with  some  remarks  on  the  relevance  of  our  theoretical  analysis. 

3.1  Opaque  Fixed-Point  Modules 

Recall  the  List  example  from  Section  2.3.  The  problem  there  was  not  that  the  body  of  the  fixed- 
point  was  ill-typed,  but  that  its  signature  contained  references  to  the  recursive  module  variable  List 
and  thus  did  not  match  the  required  signature  LIST.  Specifically,  the  type  of  the  cons  function  in 
the  body  was  int  *  List.t  ->  t,  instead  of  int  *  t  ->  t  as  in  LIST.  Solving  this  problem  by 
making  the  LIST  signature  transparent  means  that  not  only  are  the  actual  and  required  types  for 
cons  identified,  but  t  and  List.t  are  identified  during  typechecking  of  the  body.  This  solution  is 
unnecessarily  generous  because  the  opacity  of  List  .t  did  not  cause  a  problem  during  typechecking, 
only  when  matching  the  body’s  signature  against  the  required  signature. 

This  suggests  a  generalization  of  the  fixed-point  module  rule.  If  the  recursive  module  variable  s 
is  given  signature  [a:K.cJi],  then  allow  the  body  to  have  signature  [a:n.a2]-,  where  ui  =  a2[oi/ Fst  s]. 
(We  view  the  signatures  here  in  their  phase-split  form  in  order  to  have  access  to  the  a  that  gets 
substituted  for  Fst  s;  in  Section  3.4  we  will  see  how  this  can  be  avoided.)  For  instance,  the  List 
example  would  now  typecheck,  because  if  we  substitute  t  for  occurrences  of  List.t  in  the  actual 
type  of  cons  (int  *  List.t  ->  t),  we  obtain  the  required  type  (int  *  t  ->  t).  Note,  however, 
that  the  signature  of  s  is  still  opaque  (i.e.  k  can  be  anything)  during  typechecking. 
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Let’s  call  this  generalization  an  opaque  fixed-point  module  and  denote  it  with  fixp.  The  type¬ 
checking  rule  is  as  one  would  expect  from  the  description  above: 

a,  s  0  Dom(r)  T  h  5  =  [a-.n.ai]  sig 
r[s  t  -S']  h  M  I  [a\K.(j2]  r[a  ■.  k]\-  cfi  =  a2[ex/ Fst  s]  type 
r  h  fixQ{s:S.M)  :  S 

This  generalization  only  makes  sense,  of  course,  if  it  does  not  adversely  affect  phase-splitting. 
Fortunately,  it  does  not  affect  phase-splitting  at  all.  In  fact,  the  idea  for  the  generalization  came 
from  looking  at  what  premises  of  the  phase-splitting  rule  for  fixed-points  could  be  safely  relaxed. 
The  intuitive  reason  the  opaque  fixed-point  works  is  that  the  equi-recursive  constructor  used  in 
phase-splitting  turns  the  static  part  of  the  recursive  module  variable  and  the  static  part  of  the 
body  into  the  same  thing.  E.g.,  after  phase-splitting,  List.t  becomes  p,=a.l  +  int  x  a  and  t 
becomes  1  -|-  int  x  List.t,  which  are  equal  since  qi=  is  equi-recursive.  For  completeness,  here  is 
the  phase-splitting  rule  for  opaque  fixed-points,  which  differs  from  the  original  phase-splitting  rule 
only  in  the  more  flexible  premises: 

a,  s,  s^,  s''  0  Dom(r)  T  h  k  kind  r[a  :  k]\-  ai  type  r[s'’  :  K][a  :  k]\-  cj2  type 
r[s'’  ]  k]\-  c  [  K  r[s‘’  :  k][s^  t  b  e  J,  cJ2[c/a]  r[a  :  k]  h  cJi  =  cr2[a;/s'’’]  type 

T  h  fixo{s:[a:K.ai].[c[Fst  s/ e[Fst  s,  Snd  s/s"^,  s'"]])  =  [s''  =  fi=s'^:K.c,  fix{s'^ :ai[s’^ / a].e)]  :  [a:«:.cJi] 

Note  that  the  fixed-point  expression  on  the  right-hand  side  of  the  equation  typechecks  because 
t2[c/a]  =  (J2[s'^/a]  (where  =  ij,=s’^:k.c),  and  afis'^/a]  =  cJ2[a/s'^][s'^/a]  =  (T2[s‘^/a]. 


3.2  Opaque  Rds’s 


Although  the  above  generalization  of  the  fixed-point  construct  solves  the  problem  with  opaque 
recursive  modules  in  the  List  example,  it  does  not  entirely  solve  the  similar  problem  in  CHP’s 
ExpDec  example.  We  will  now  examine  the  details  of  ExpDec  because  they  motivate  the  idea  of  an 
opaque  rds. 

In  ExpDec,  we  essentially  want  to  break  up  a  pair  of  mutually  recursive  types  exp  and  dec, 
representing  abstract  syntax  for  expressions  and  declarations  in  some  object  language,  and  put 
them  in  mutually  recursive  modules  Exp  and  Dec,  respectively.  In  the  absence  of  rds’s,  this  is 
impossible  —  even  if  exp  and  dec  are  kept  abstract,  the  types  of  some  of  exp’s  constructors  will 
need  to  refer  to  dec  and  vice  versa,  and  there  is  no  way  to  express  this  in  typical  hierarchical 
signatures  (e.g.  the  boxed  types  are  mutually  recursive  references,  the  first  of  which  is  illegal): 


signature  EXPDEC 
structure  Exp 
type  exp 
val  make_let 
...  (*  other 


=  sig 
sig 


Dec . dec 


constructors 


exp  ->  exp 

and  destructors  *) 


end 

structure  Dec 
type  dec 
val  make_val 
...  (*  other 


sig 


identifier  * 


Exp. exp  ->  dec 


constructors  and  destructors  *) 


end 

end 


(*  let  DEC  in  EXP  end  *) 


(*  val  ID  =  EXP  *) 
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CHP  observe  that  this  problem  does  not  arise  when  signatures  are  written  in  phase-split  form 
because  the  recursive  dependencies  are  dynamic- on- static  (i.e.  terms  on  types),  not  static- on- static 
(i.e.  types  on  types).  Once  the  static  and  dynamic  parts  of  the  signature  have  been  separated, 
dynamic-on-static  dependencies  are  no  longer  recursive.  Indeed,  there  was  no  problem  in  the  List 
example  because  the  LIST  signature  was  naturally  in  phase-split  form. 

Although  writing  signatures  in  phase-split  form  solves  the  problem,  it  is  not  reasonable  to  force 
code  to  be  written  in  phase-split  style  in  general.  CHP  illustrate  a  failed  attempt  to  avoid  this  by 
declaring  exp  and  dec  in  the  signatures  of  both  Exp  and  Dec,  all  opaquely: 

signature  EXPDEC_CHP  =  sig 
structure  Exp  :  sig 
type  exp 
type  dec 

val  make_let  :  dec  *  exp  ->  exp 
end 

structure  Dec  :  sig 
type  dec 
type  exp 

val  make_val  :  identifier  *  exp  ->  dec 
end 
end 

This  rids  the  signature  of  recursive  references,  but  now  there  is  no  connection  between  the  types 
Exp. exp  and  Dec. exp  (and  Dec. dec  and  Exp. dec).  Thus,  syntax  trees  constructed  with  the 
datatype  constructors  in  Exp  are  incompatible  with  those  constructed  with  the  datatype  construc¬ 
tors  in  Dec,  rendering  a  module  with  this  signature  useless. 

Our  solution  is  to  define  an  opaque  variant  of  the  rds  construct,  denoted  p,  that  only  allows 
dynamic-on-static  recursive  dependencies.  (The  original  rds  construct,  which  we  will  continue  to 
denote  with  p,  is  by  its  very  nature  transparent.)  First,  the  typechecking  rule: 

a,  s"^  0  Dom(r)  T  h  k  kind  r[s'^  :  k]  h  S'  =  [a:K.a]  sig 

T  h  ps'^.S  sig 

The  restriction  on  recursive  dependencies  is  enforced  by  requiring  the  static  part  n  of  signature  S  to 
be  well-formed  in  the  ambient  context  (so  S  cannot  contain  static-on-static  references  to  s'^).  The 
phase-splitting  rule  makes  clear  that  there  is  nothing  fundamentally  recursive  about  this  opaque 
rds,  in  contrast  to  the  fully  transparent  rds  defined  previously: 

a,  s"^  0  Dom(r)  T  h  k  kind  r[s'^  :  /t]  [a  :  k]  h  a  type 
T  h  ps'^ .[a-.K.a]  =  [s'^:K.a[s'^ / a\\  sig 

With  the  opaque  rds,  we  can  “close”  the  dynamic-on-static  dependencies  desired  in  the  EXPDEC 
signature  as  follows: 

p  ExpDec . 
sig 

structure  Exp  :  sig 
type  exp 

val  make_let  :  ExpDec . Dec . dec  *  exp  ->  exp 
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end 

structure  Dec  :  sig 
type  dec 

val  make_val  :  identifier  *  <ExpDec . >Exp . exp  ->  dec 
end 
end 

The  angle  brackets  around  ExpDec  in  the  type  of  Dec .  make.val  signify  that  the  reference  to  Exp .  exp 
could  be  chosen  to  be  local  or  recursive,  and  the  signature  would  phase-split  the  same  either  way. 

3.3  Transparent  Fixed-Point  Modnles 

In  the  spirit  of  isolating  equi-recursiveness,  one  may  wonder  whether  the  transparent  recursive 
module  idiom  requires  the  power  of  equi-recursive  constructors  in  both  its  fixed-point  and  rds 
constructs.  Interestingly,  the  answer  is  no. 

Any  fixed-point  module  with  transparent  signature  is  equivalent  to  a  module  of  the  form 
fix{s:[a:5{c  :  K).a].[c\  e]).  After  phase-splitting,  the  static  part  of  the  module  is  iJ,=a:5{c  :  K).c'[a/ Fst  s]. 
Since  the  kind  of  this  constructor  is  S(c  :  k),  it  is  provably  equal  to  c!  So,  in  fact,  equi-recursive 
constructors  are  not  needed  here.  This  leads  us  to  observe  that  the  notion  of  a  transparent  fixed- 
point  module  —  that  is,  a  fixed-point  module  with  fully  transparent  signature  —  is  orthogonal  to 
that  of  a  transparent  rds  and  would  benefit  from  being  treated  separately.  This  does  not  imply, 
of  course,  that  the  examples  of  transparent  recursive  modules  presented  in  CHP  do  not  require 
equi-recursive  constructors  in  their  interpretation.  It  simply  means  that  the  equi-recursiveness  is 
completely  encapsulated  in  the  notion  of  a  transparent  rds  and  those  examples  require  the  use  of 
rds’s. 

We  will  denote  the  transparent  fixed-point  module  construct  with  fixrp.  The  typechecking  and 
phase-splitting  rules  are  as  one  would  expect: 

s  0  Dom(r)  r[s  t5]l-MJ,5  ri-5  =  [a:5(c  :  rfi.a]  sig 
r  h  fixrp{s:S.M)  :  S 

a,  s,  s^,  s'"  0  Dom(r)  T  h  5(c  :  k)  kind  r[Q;  :  S(c  :  «:)]  h  a  type 
r[s'^  t  S(c  :  k)]  b  c'  I  S(c  :  k)  :  S(c  :  k)][s'’  T  b  e  J,  a[c' /a] 

T  b  fixrp[s-.[a-.5{c  :  K).a].[c'[Fst  s/s’^],  e[Fst  s,  Snd  s/s'^,  s'"]]) 

=  =  c,  fix{s'^ :a[s’^ / a].e)]  :  [a:S(c  :  K).a] 

Note  that  the  generalization  from  Section  3.1  would  not  increase  the  expressiveness  of  transparent 
fixed-points.  Specifically,  any  references  to  Fst  s  in  the  signature  of  the  body  of  the  fixed-point 
could  be  replaced  by  a  anyway,  since  both  Fst  s  and  a  have  kind  S(c  :  k)  and  thus  are  equivalent. 

3.4  Fundamental  Property  of  Recursive  Modules 

Now  that  we  have  defined  two  clearly  differentiated  recursive  module  idioms,  it  is  interesting  to 
examine  what  they  have  in  common. 

In  their  treatment  of  practical  issues  regarding  recursive  modules,  CHP  suggest  an  “appealing 
typechecking  strategy”  for  transparent  recursive  modules  that  basically  works  as  follows.  Assume 
the  module  has  the  form  fix{s:S.M),  where  S  =  ps’^.Ss,  the  notation  Sb  signifying  the  body  of 
S.  Instead  of  checking  that  M  has  signature  S  as  prescribed  by  the  fixed-point  typechecking  rule, 
check  that  M  has  signature  SslFst  s / s^],  in  a  context  where  s  has  signature  S.  In  other  words. 
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check  that  the  body  of  the  fixed-point  can  be  ascribed  the  body  of  the  rds,  under  the  assumption 
that  the  recursive  module  variable  can  be  ascribed  the  rds.  CHP  give  a  simple  proof  that  this 
typechecking  strategy  is  sound  and  complete  with  respect  to  the  fixed-point  typechecking  rule. 

This  approach  is  appealing  because  it  provides  a  more  intuitive  and  practical  way  of  defining 
and  implementing  fixed-point  module  typechecking  than  that  supplied  by  the  type  system  directly. 
In  ordinary  fixed-point  typechecking,  one  would  have  two  options  for  checking  that  the  body  M 
of  a  fixed-point  has  rds  ps'^-Ss-  One  could  use  the  rds  roll  rule  to  reduce  this  to  checking  that  M 
has  Sb  [Fst  M/s'^],  but  this  involves  a  substitution  of  Fst  M  when  it  is  not  even  known  that  M  is 
well-formed.  The  substitution  of  Fst  s  for  s'^  in  the  new  strategy  is  much  more  straightforward  to 
implement.  Alternatively,  one  could  compute  the  principal  signature  of  M,  phase-split  it  and  check 
if  it  is  a  subtype  of  the  rds.  While  this  is  feasible  to  implement,  the  new  approach  is  far  more 
elegant  because  it  can  be  described  at  a  high  level  above  the  “implementation”  level  of  phase-split 
forms. 

What  we  find  particularly  appealing  about  this  approach  is  that  it  turns  out  to  apply  to  opaque 
recursive  modules  as  well,  albeit  in  a  slightly  weaker  form.  This  leads  us  to  believe  that,  in  addition 
to  being  a  guide  to  implementation,  this  typechecking  strategy  is  fundamental  to  recursive  module 
systems,  at  least  of  the  kind  we  have  studied.  Here  we  formalize  it  for  arbitrary  fixed-point  and 
rds  constructs  FIX  and  R: 

Definition  1  (Fundamental  Property  of  Recursive  Modules) 

A  fixed-point  module  construct  FIX{s:S.M)  and  an  rds  construct  Rs'^.S  form  a  recursive  module 
system  if  they  satisfy  the  following  property: 

•  T  h  FIX{s:S.M)  :  S  if  and  only  if 

such  that  T  h  5  =  Rs'^.Sb  sig  and  r[s  I  5]  h  M  J,  SB[Fst  s/s'^]. 

The  rule  described  above  was  stronger  than  the  one  given  here  in  that  the  equivalence  of  S  and 
Rs'^.Sb  was  an  assumption,  whereas  here  it  appears  inside  the  “iff”.  The  reason  for  this  weakening 
is  that,  for  opaque  modules,  there  exists  an  rds  equivalent  to  S  that  models  the  signature  of  the 
module  body  correctly,  but  not  every  equivalent  rds  does  so.  This  makes  this  strategy  arguably  less 
practical  than  the  first  one,  but  more  useful  in  theory  as  a  shared  high-level  property  of  recursive 
modules. 

We  now  proceed  to  show  that  both  the  opaque  and  transparent  idioms  obey  this  fundamental 
property.  The  proof  requires  us  to  be  able  to  perform  inversion  on  the  typechecking  rules  for  fixQ 
and  fixrp^  which  we  cannot  do  with  the  rds  roll  and  unroll  rules  from  Section  2.4  built  into  the 
type  system.  Luckily,  the  roll  and  unroll  rules  are  not  essential  (for  either  opaque  or  transparent 
rds’s)  and  can  be  shown  to  be  admissible  in  the  type  system  without  them.  The  proofs  are  fairly 
straightforward.  It  is  worth  noting  that  the  admissibility  proof  for  opaque  rds’s  makes  use  of  the 
“self”  rule  to  reify  the  static  part  of  the  opaque  signature;  and  that  the  proof  for  transparent  rds’s 
makes  use  of  equi-recursive  unrolling  for  admissibility  of  unroll  and  bisimilarity  for  admissibility  of 
roll. 

Lemma  2  (Admissibility  of  Roll  and  Unroll  for  Opaque  Rds’s) 

Suppose  T  h  ps^.S  sig.  Then,  T  h  M  :  ps'^.S  if  and  only  if  T  \-  M  :  S[Fst  M/s^^]. 

Proof:  Suppose  T  h  ps'^.S  sig.  Then,  by  well-formedness  of  opaque  rds’s,  we  know  that  r[s'^  : 
/t]  h  5  =  [a:K.a\  sig,  and  T  h  ps'^.S  =  [a:K.a[a/ s'^]]  sig. 

:  Suppose  T  h  M  :  ps^.S.  Then,  T  h  M  :  [a:K.a[a/ s'^]].  By  selfification,  T  h  M  :  [a:5{FstM  : 
K).a[a/ s"^]].  This  signature  is  equivalent  to  [a:5{Fst  M  :  n) .a[Fst  M / s"^]]  <  [a\K.a[Fst  M / s"^]]  = 
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S[FstM/s^]. 

<^=  :  Suppose  F  h  M  :  S[Fst  M/s"^].  Then,  F  h  M  :  [a:K,.a[Fst  M/s‘^]].  By  selfification, 
F  h  M  :  [a:5{Fst  M  :  K).a[Fst  M/ s"^]].  This  signature  is  equivalent  to  [a:5{Fst  M  :  K).a[a/s^]]  < 
[a-.n.a[a/ s"^]]  =  ps’^.S.  ■ 


Lemma  3  (Admissibility  of  Roll  and  Unroll  for  Transparent  Rds’s) 

Suppose  F  h  ps^.S  sig.  Then,  F  h  M  :  ps'^.S  if  and  only  if  T  \-  M  :  S[Fst  M/s^^]. 

Proof:  Suppose  F  h  ps'^.S  sig.  Then,  by  well-formedness  of  transparent  rds’s,  we  know  that 
F[s‘^  :  k]  h  5  =  [a:5(c  :  K).a]  sig,  and  F  h  ps'^.S  =  [a:5{p=s'^:K.c  :  K).a[a/s'^\\  sig,  where 
F[s"  T  k]  ^  C  i  K. 

:  Suppose  F  h  M  :  ps'^.S.  Then,  F  h  M  :  [a-.B{p=s’^:K.c  :  k).it[q;/s‘’]].  Since  F  h  Fst  M  = 
P=s'^:k.c  :  k,  M’s  signature  is  equivalent  to  [a:5{p=s'^:K.c  :  K).a[Fst  M/ .s’^]],  which,  by  equi-recursive 
unrolling,  is  equivalent  to  [a:5{c[Fst  M / s’^]  :  K).a[Fst  M/s"^]]  =  S[Fst  M/s'^]. 

<i=  :  Suppose  F  h  M  :  S[Fst  M/s'^].  Then,  F  h  M  :  [a:5{c[Fst  M/s’^]  :  K).a[Fst  M/s^]].  Since 
F  h  Fst  M  =  c[FstM/s’^]  :  k,  bisimilarity  implies  that  FstM  =  p=s'^:K.c,  so  M’s  signature  is 
equivalent  to  [a:5{p=s'^:K.c  :  K).a[a/s'^]]  =  ps'^.S.  ■ 


Theorem  4 

The  opaque  fixed-point  and  rds  constructs  form  a  recursive  module  system. 

Proof: 

:  Suppose  F  h  fixQ^S'.S.M)  :  S.  By  inversion  on  typechecking  for  fixQ,  F  h  5  =  [a'.K.ai]  sig, 
F[s  I  5]  h  M  I  [a\K.a2[Fst  s/ s"^]]  and  F[a  -.  n]  \-  ai  =  a2[a/s'^]  type.  Choosing  Sb  to  be 
[a:K.a2],  it  is  clear  that  F[s  t  >5']  b  M  |  SsiFst  s/.s’^].  By  phase-splitting  for  p,  we  also  have 
F  h  ps'^.Sb  =  [a:K.a2[oi/ s'^\\  =  [am.ai]  =  S  sig. 

<i=  :  Suppose  F  h  5  =  ps'^.Sb  sig  and  F[s  |  S']  h  M  J,  SB[Fst  s/ s^].  Well-formedness  of  ps'^.Sb 
requires  that  F[s'’  :  k]  h  Sb  =  [a-.K.a2[s'^ / Fst  s\\  sig,  where  F  h  k  kind  (i.e.  Sb  contains  only 
dynamic-on-static  recursive  references)  and  s'^  is  not  free  in  02.  Choose  ui  to  be  a2[oi/ Fst  s\. 
Then,  by  phase-splitting  for  p,  we  have  that  F  h  S  =  [a:K.a2[oi/ Fst  s\]  =  [a:n.ai]  sig.  Since 
SB[Fsts/s'^]  =  [a:K.cJ2],  the  initial  assumption  gives  us  F[s  t  -S']  b  M  J,  [a:K.a2].  We  have  thus 
fulfilled  the  premises  of  the  typechecking  rule  for  fixQ,  so  F  b  fixQ{s:S.M)  :  S.  ■ 

Theorem  5 

The  transparent  fixed-point  and  rds  constructs  form  a  recursive  module  system. 

Proof: 

:  Suppose  F  b  fixrp{s:S.M)  :  S.  By  inversion  on  typechecking  for  fixrp,  F  b  S  =  [a:S(c  :  K).a]  sig 
and  F[s  j  S]  b  M  J,  S.  Choose  Sb  to  be  S.  Since  0  FV(S),  F  b  SB[Fsts/s’^]  =  S  sig,  and 
so  F[s  t  S]  b  M  I  Sb  [Fst  s/s'’’] .  Also,  because  s'”  0  FV(c)  U  FV(cr)  and  by  phase-splitting  for  p, 
F  b  ps'^.Sb  =  [a:5{p=s'^:K.c  :  K).a[a/s'^]]  =  [a:S(c  :  K).a]  =  S  sig. 

:  Suppose  F  b  S  =  ps'^.Sb  sig  and  F[s  j  S]  b  M  J,  Sb  [Fst  s/s'”] .  Well-formedness  and  phase¬ 
splitting  for  p  tell  us  that  F  b  Sb  =  [a:S(c  :  k).it]  sig  and  F  b  S  =  [a:5{p,=s'^:K.c  :  K).a[a/s'^]]  sig. 
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(Note  of  course  that  S  is  transparent,  a  requirement  of  the  fixrp  typechecking  rule.)  By  equational 
reasoning  with  singleton  kinds,  r[s  T  -S']  h  Fst  s  =  ij,=  s'^:k.c  :  k.  Therefore,  r[s  T  5']  h  SsiFst  s/s'^]  = 
[a:5{c[iJ,=s^:K.c/ s'^]  :  K).a[iJ,=s^:K.c/ s'^]]  =  [a:5{fi=s^:K.c  :  K).a[a/s'^]]  =  ps^-Ss  =  S  sig.  Combining 
this  with  the  initial  assumption  yields  r[s  |  S']  h  M  J,  5.  We  may  now  apply  the  typechecking  rule 
for  fixrp  to  obtain  T  h  fixrp(s:S.M)  :  S.  ■ 


3.5  Theoretical  Conclusions 

What  has  our  theoretical  analysis  achieved?  First  of  all,  we  have  expanded  the  scope  of  the  opaque 
recursive  module  idiom  by  generalizing  the  fixed-point  construct  and  defining  an  opaque  form  of 
rds,  so  that  the  examples  from  CHP  do  not  require  the  use  of  transparent  modules.  Second,  we 
have  isolated  the  uses  of  equi-recursiveness  in  the  two  idioms  by  demonstrating  that  opaque  rds’s 
and  transparent  fixed-points  do  not  involve  recursion  on  type  constructors.  We  can  therefore  freely 
employ  the  latter  two  constructs  in  a  recursive  module  design  extension  for  the  external  language. 

However,  we  cannot  simply  discard  the  other  two  equi-recursive  constructs  as  impractical  since 
they  are  the  only  constructs  that  encode  module-recursive  constructors  at  all!  Enabling  the  defini¬ 
tion  of  mutually  recursive  types  in  separate  modules  is  one  of  the  main  motivations  behind  recursive 
modules,  so  a  recursive  module  design  that  does  not  handle  module-recursive  types  is  highly  unde¬ 
sirable.  This  points  to  a  weakness  in  our  theory,  namely  that,  for  simplicity,  we  have  conflated  the 
notions  of  recursive  and  equi-recursive. 

If  the  choice  between  opaque  and  transparent  programming  were  to  be  made  based  on  the 
analysis  thus  far,  the  transparent  idiom  would  still  be  preferable  because,  while  both  require  equi- 
recursive  constructors,  transparent  recursive  modules  are  strictly  more  expressive  than  opaque 
recursive  modules.  That  is,  every  opaque  module  is,  by  phase-splitting,  equivalent  to  the  trans¬ 
parent  module  formed  by  reifying  all  the  abstract  type  constructors  in  the  opaque  rds  with  their 
module-recursive  definitions  in  the  body  of  the  fixed-point.  Intuitively,  the  transparent  idiom  is 
more  expressive  because  it  propagates  more  type  equations  during  typechecking.  (It  is  worth  not¬ 
ing  that  the  reverse  is  true  for  the  fixed-point  constructs  per  se,  in  the  sense  that  transparent 
fixed-points  are  a  special  case  of  opaque  fixed-points.) 

Why  bother,  then,  with  the  opaque  idiom?  Because  the  power  of  the  opaque  idiom,  strong 
enough  to  handle  the  examples  from  CHP,  relies  less  on  equi-recursiveness  than  that  of  the  trans¬ 
parent  idiom.  In  the  next  section,  we  will  explore  if  and  how  iso-recursive  constructors  can  be 
used  to  replace  equi-recursive  constructors.  In  the  process,  we  will  see  that  the  restricted  use  of 
equi-recursiveness  in  opaque  programming  can  be  an  advantage  when  the  goal  is  eliminating  equi- 
recursiveness.  Thus,  understanding  why  the  opaque  idiom  was  able  to  encode  both  examples  from 
CHP,  despite  the  fact  that  it  is  generally  less  expressive  than  the  transparent  idiom,  is  a  fruitful 
starting  point  for  our  practical  analysis. 


4  Recursive  Modules  in  Practice 

We  will  now  take  the  theoretical  constructs  studied  in  the  previous  section  and  attempt  to  develop  a 
recursive  module  extension  for  Standard  ML  based  on  them.  The  key  criteria  for  such  an  extension 
are  how  practical  it  is  and  how  well  it  generalizes  recursive  constructs  at  the  ML  core  level  (i.e. 
datatypes  and  recursive  functions).  Not  surprisingly,  our  end  result  is  a  proposal  that  combines 
aspects  of  both  opaque  and  transparent  programming.  Before  arriving  there,  however,  we  will  look 
at  each  fixed-point  and  rds  construct  and  judge  its  usefulness  by  the  abovementioned  criteria. 
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4.1  The  Practical  Uses  of  Fixed-Point  Modules 


We  begin  the  practical  comparison  of  the  two  recursive  module  idioms  with  the  question  that  ended 
the  previous  section:  If  transparent  programming  is  more  expressive  than  opaque  programming, 
why  did  the  examples  from  CHP  work  fine  in  the  opaque  idiom?  The  answer  is  very  simple:  both 
the  List  and  ExpDec  examples  were  essentially  implementations  of  ML-style  datatypes. 

The  datatype  mechanism,  one  of  ML’s  strongest  successes,  provides  the  only  way  of  defining 
(mutually)  recursive  types  in  the  language,  specifically  recursive  (tuples  of)  labeled  sums.  The 
labels  corresponding  to  a  datatype  serve  as  constructors  of  the  type,  as  well  as  destructors  when 
used  in  pattern  matching  against  elements  of  the  type.  In  order  to  make  the  relationship  between 
datatypes  and  opaque  programming  absolutely  clear,  it  is  first  necessary  to  define  what  we  mean 
by  “implementing  datatypes”.  In  Section  4.1.1,  we  provide  a  brief  review  of  Harper  and  Stone’s 
formalization  of  the  implementation  of  datatypes.  In  Section  4.1.2,  we  discuss  how  opaque  recursive 
modules  generalize  datatypes.  Then,  in  Section  4.1.3,  we  present  an  example  of  non-datatype  code 
in  an  opaque  module  that  suggests  opaque  modules  are  better  suited  to  implementing  datatypes 
than  to  general  programming.  Finally,  in  Section  4.1.4,  we  look  at  how  transparent  fixed-points 
generalize  ML’s  recursive  function  definitions. 

4.1.1  The  Harper-Stone  Interpretation  of  Datatypes 

Harper  and  Stone  have  given  an  interpretation  of  Standard  ML  into  an  underlying  type  theory 
with  modules  [10],  which  serves  as  the  basis  of  the  front-end  of  the  TILT  compiler  for  Standard 
ML  in  development  at  Carnegie  Mellon  University.  This  interpretation  takes  the  form  of  a  set  of 
elaboration  rules  that  involve  typechecking  of  the  ML  source  program  as  well  as  the  “desugaring”  of 
ML  features  such  as  pattern  matching  and  signature  patching.  Perhaps  the  most  infamous  of  these 
elaboration  rules,  for  its  length  and  complexity,  is  the  one  for  handling  datatype  declarations. 

What  datatype  elaboration  essentially  does  is  to  translate  a  datatype  into  a  module  containing 
a  recursive  sum  type  along  with  a  constructor  function  for  each  branch  of  the  sum  and  a  single 
destructor  function  (labeled  expose)  returning  a  non-recursive  sum.  Suppose  for  the  moment  that 
Harper-Stone  used  equi-recursive  types.  Then,  the  constructors  would  merely  be  sum  injections 
and  the  destructor  a  no-op.  For  example,  an  integer  list  type  written  in  ML  as 

datatype  intlist  =  nil  |  cons  of  int  *  intlist 

could  be  elaborated  to 

struct 

type  intlist  =  ^=a.  1  -|-  int  x  a 
val  nil  :  intlist  =  ini  () 

fun  cons  (x:int,  L: intlist)  :  intlist  =  inr  (x,L) 
fun  expose  (L: intlist)  :  1  +  int  *  intlist  =  L 
end 

Voila!  This  is  the  phase-split  form  of  the  List  example.  One  could  similarly  write  ExpDec  as  two 
mutually  recursive  datatype  declarations  (connected  with  an  and),  and  it  would  elaborate  to  the 
phase-split  form  of  the  recursive  ExpDec  module.  This  is  what  we  mean  by  saying  that  the  CHP 
examples  are  implementations  of  ML  datatypes. 

In  practice,  however,  equi-recursiveness  is  not  necessary  here.  An  iso-recursive  constructor 
(denoted  //^)  can  be  used,  with  rolls  and  unrolls  added  to  the  constructor  and  destructor  definitions, 
respectively,  in  order  to  mediate  between  the  recursive  sum  type  and  its  unrolling.  (More  specifics 


17 


about  iso-recursive  types  and  their  typing  rules  will  be  given  in  Section  4.2.1.)  Thus,  Harper-Stone 
actually  elaborates  datatype  intlist  to  the  following: 

struct 

type  intlist  =  1  -|-  int  x  a 

val  nil  :  intlist  =  rollintiist  (ini  ()) 

fun  cons  (x:int,  L:intlist)  :  intlist  =  rollintiist (inr  (x,L)) 
fun  expose  (L: intlist)  :  1  +  int  *  intlist  =  unrollintiist  (L) 
end 

Note  that  we  are  simplifying  here  by  ignoring  the  details  of  equality  compilation  (i.e.  the  generation 
of  recursive  equality  functions  for  datatypes)  and  polymorphic  datatypes,  because  they  are  basically 
orthogonal  to  the  discussion.  In  addition,  there  is  some  controversy  over  whether  or  not  the 
module  resulting  from  datatype  elaboration  should  be  sealed  with  an  opaque  signature  hiding  the 
implementation  of  the  datatype,  as  is  prescribed  in  Harper-Stone.  We  will  return  to  that  issue  in 
Section  4.3. 

4.1.2  Generalization  of  ML  Datatypes 

There  are  two  main  ways  in  which  opaque  recursive  modules  generalize  ML  datatypes.  First,  the 
ExpDec  example  exhibits  how  recursive  modules  permit  mutually  recursive  type  definitions  across 
module  boundaries,  so  long  as  the  type  definitions  occur  under  the  roof  of  a  single  opaque  fixed- 
point  and  refer  to  each  other  through  it.  We  have  discussed  already  that  this  was  one  of  the  primary 
motivations  for  a  recursive  module  extension  in  the  first  place. 

Second,  opaque  recursive  modules  generalize  and  simplify  ML’s  withtype  mechanism.  In  ML, 
withtype  provides  a  way  of  writing  ordinary  type  definitions  in  a  mutually  recursive  bundle  with 
datatype  definitions.  In  order  to  avoid  the  need  for  equi-recursive  types,  the  ordinary  type  def¬ 
initions  may  only  refer  to  the  datatypes,  not  one  another.  The  Definition  of  Standard  ML  [16] 
specifies  that  each  occurrence  of  one  of  the  withtype  identifiers  in  the  datatype  definitions  should 
be  macro-expanded  to  the  type  bound  to  the  identifier.  This  is  viewed  by  some  as  a  rather  ugly 
style  of  definition,  if  a  necessary  one. 

In  an  opaque  recursive  module,  the  withtype  definitions  are  unnecessary  and  can  be  replaced 
by  ordinary  type  definitions.  In  order  to  do  this,  we  simply  need  to  make  the  withtype  definitions 
transparent  in  the  rds,  which  is  feasible  because  they  are  not  module-recursive.  Consider  a  simple, 
if  admittedly  useless,  reworking  of  the  List  example,  where  the  int  *  List.t  is  extracted  into  a 
withtype  definition.  Here  is  the  ML  version: 

datatype  list  =  nil  |  cons  of  headtail 
withtype  headtail  =  int  *  list 

ML  macro-expands  this  to: 

datatype  list  =  nil  |  cons  of  int  *  list 
type  headtail  =  int  *  list 

The  opaque  recursive  module  can  skip  the  macro-expansion: 

signature  LIST  = 
sig 

type  list 
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type  headtail  =  int  *  list 
val  nil  :  list 
val  cons  :  headtail  ->  list 
val  expose  :  t  ->  1  +  headtail 
end 

structure  List  =  ofix  (List  :  LIST, 
type  list  =  1  +  List .headtail 
type  headtail  =  int  *  list 


) 


Having  shown  that  opaque  recursive  modules  are  closely  tied  to  ML  datatypes  and  generalize 
them  in  interesting  ways,  it  remains  for  us  to  show  that  they  can  be  implemented  without  the  use 
of  equi-recursive  constructors.  That  issue  warrants  more  extended  discussion  and  is  taken  up  in 
Section  4.2. 

4.1.3  The  Limitations  of  Opacity 

Suppose  we  want  to  extend  our  trusty  List  example  with  a  simple  function  called  nthtail  that 
takes  a  list  L  and  an  integer  n  and  returns  the  result  of  removing  the  first  n  elements  from  the 
list.  Writing  this  function  inside  the  body  of  the  List  module  gives  the  programmer  access  to  the 
module-recursive  sum  implementation  of  the  datatype  t.  Here  is  a  naive  attempt  to  make  use  of 
the  access  to  that  implementation: 

fun  nthtail  (L  :  t,  n  :  int)  :  t  = 
if  n  =  0  then  L 
else  case  L  of 

ini  0  =>  raise  Error 

I  inr  (hd  :  int,  tl  :  List.t)  =>  nthtail (tl,  n-1) 

This  fails  to  typecheck  because  the  type  of  tl  does  not  match  the  argument  type  of  nthtail  in 
the  recursive  call.  There  are  two  ways  to  fix  this.  We  could  either  make  the  type  of  argument  L 
be  List  .t  instead  of  t,  or  add  nthtail  to  the  rds  LIST  and  replace  the  recursive  call  to  nthtail 
with  a  call  to  List  .nthtail.  Either  way,  both  the  argument  type  and  result  type  of  nthtail  will 
become  List.t  because  the  argument  L  is  also  the  result  in  the  case  that  n  =  0. 

This  in  turn  causes  a  problem  because  the  case  expression  requires  L  to  be  of  sum  type,  which 
List.t  is  not  since  it  is  abstract.  The  simple  way  to  solve  this  is  to  make  a  module-recursive  call 
to  List. expose  before  case  analysis.  The  new  nthtail  implementation  is  as  follows: 

fun  nthtail  (L  :  List.t,  n  :  int)  :  List.t  = 
if  n  =  0  then  L 
else  case  List . expose (L)  of 
ini  0  =>  raise  Error 

I  inr  (hd  :  int,  tl  :  List.t)  =>  nthtail (tl,  n-1) 

This  example  illustrates  that  “programming  inside  the  datatype”  is  not  only  of  limited  utility, 
but  sometimes  impossible.  Attempts  to  exploit  access  to  the  datatype  implementation  result  in 
confusing  and  unintuitive  typechecking  errors.  The  solution  we  have  arrived  at  for  nthtail  is 
really  an  instance  of  “programming  outside  the  datatype”  because  it  treats  the  datatype  completely 
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abstractly  and  could  have  just  as  easily  been  written  outside  the  module.  This  does  not  imply  that 
there  is  anything  wrong  with  writing  code  like  nthtail  inside  the  List  module.  In  fact,  in  a 
recursive  module  design  for  ML,  we  will  definitely  want  to  be  able  to  write  such  functions  in  the 
same  modules  as  the  datatypes  they  traverse.  The  point  is  simply  that  there  is  no  expressiveness  to 
be  gained  from  having  access  to  the  datatype  implementation,  and  it  does  not  make  sense  to  thrust 
the  “opaque”  distinction  between  t  and  List.t  on  the  ML  programmer  if  it  is  not  necessary.  We 
thus  turn  to  transparent  fixed-points  now  to  see  how  they  generalize  recursive  programming  in  the 
absence  of  the  opaque  distinction. 

4.1.4  Transparent  Fixed-Points  and  Polymorphic  Recursion 

For  transparent  fixed-point  modules,  practicality  is  not  an  issue  since  they  do  not  involve  any 
recursion  on  types  at  all.  The  most  obvious  way  that  transparent  fixed-points  generalize  ML’s 
recursive  function  definitions  is  that  the  fixed-point  allows  mutually  recursive  functions  to  be  defined 
in  separate  submodules.  In  addition,  perhaps  less  immediately  clear  is  that  the  fixed-point  module 
construct  provides  a  very  systematic  way  of  expressing  polymorphic  recursion. 

“Polymorphic  recursion”  refers  to  the  ability  of  a  polymorphic  function  to  call  itself  recur¬ 
sively  at  different  instantiations  of  its  polymorphic  type.  This  is  particularly  useful  when  writing 
a  function  that  traverses  a  non-uniform  datatype.  Analogously,  non-uniform  datatypes  allow  a 
polymorphic  datatype  definition  to  refer  to  itself  recursively  at  different  instantiations  of  its  type 
argument.  Here  is  a  simple  contrived  example  to  illustrate  the  two  together: 

datatype  ’at=A  |  Bof  ’a*  ’att 

(*  val  collect  :  ’a  t  ->  ’a  list 

Collects  all  the  data  of  type  ’a  in  t  and  forms  a  list.  *) 
fun  collect  (A)  =  [] 

I  collect  (B(x,FF))  =  (*  x  :  ’a,  FF  :  ’a  *  'a  t  t  *) 

let  val  FL  =  collect  FF  (*  collect  needs  type  ’a  t  t  ->  ’at  list  *) 

val  LL  =  map  collect  FL  (*  collect  needs  type  ’at  ->  ’a  list  *) 
val  L  =  foldl  (op  @)  []  LL 
in  x: :L  end 

The  difference  between  non-uniform  and  uniform  datatypes  (and  between  polymorphic  and 
monomorphic  recursion)  is  simply  a  matter  of  where  the  constructor-level  (or  term-level)  type 
abstraction  occurs  —  inside  or  outside  the  fixed-point.  Non-uniform  datatypes  and  polymorphi- 
cally  recursive  functions  are  fixed-points  at  function  kind  and  function  type,  respectively,  the  type 
lambda  occurring  underneath  the  fixed-point  construct.  Oddly,  Standard  ML  admits  non-uniform 
datatypes,  but  not  polymorphic  recursion,  the  reason  being  that  the  latter  causes  type  inference 
to  be  undecidable.  The  issue  of  type  inference  in  the  presence  of  polymorphic  recursion,  as  well 
as  any  debate  over  the  merits  of  polymorphic  recursion,  is  outside  the  scope  of  this  paper.  (See 
[17,  11,  12,  18]  for  more  on  the  subject.) 

By  forcing  the  types  of  the  functions  in  the  module  body  to  be  specified  in  the  rds,  transparent 
fixed-points  enable  the  coexistence  of  monomorphic  and  polymorphic  recursion  without  affecting 
type  inference  at  all.  While  ordinary  recursive  calls  remain  monomorphic,  module-recursive  calls 
are  free  to  be  polymorphic.  So,  for  instance,  in  the  example  above,  the  collect  function  could  be 
expressed  in  a  transparent  fixed-point  as  follows: 

tfix  (Mod  :  sig  val  collect  :  ’a  t  ->  ’a  list  end. 
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fun  collect  (A)  =  [] 

I  collect  (B(x,FF))  = 

let  val  FL  =  Mod. collect  FF  (*  polymorphic  recursive  call  *) 
val  LL  =  map  collect  FL  (*  monomorphic  recursive  call  *) 
val  L  =  foldl  (op  @)  []  LL 
in  x: :L  end 

) 

The  reason  transparent  fixed-points  have  no  trouble  expressing  polymorphic  recursion  is  that  they 
are  implemented  with  the  term-level  fix  construct,  which  is  defined  to  work  over  an  arbitrary  type. 
In  a  sense,  relying  on  the  explicitness  of  the  rds  in  a  transparent  fixed-point  could  be  viewed  as 
a  non-solution  to  the  type  inference  problem  for  polymorphic  recursion.  On  the  other  hand,  it 
follows  the  “pay  as  you  go”  principle  of  language  design  —  polymorphic  recursion  is  encapsulated 
in  the  transparent  fixed-point  construct,  so  ML  code  that  does  not  need  polymorphic  recursion 
may  continue  to  employ  type  inference. 

4.2  Implementing  Generalized  Datatypes 

Deeming  opaque  recursive  modules  a  useful  generalization  of  ML  datatypes,  we  would  like  to  make 
them  practical  by  replacing  the  use  of  equi-recursive  constructors  with  iso-recursive  constructors. 
First,  in  Section  4.2.1,  we  will  discuss  how  iso-recursive  constructors  work.  Then,  in  Sections  4.2.2 
and  4.2.3,  we  will  explore  a  technique  for  implementing  datatype-generalizing  opaque  fixed-points 
using  iso-recursive  constructors. 

4.2.1  Iso-Recursive  Constructors 

Despite  their  practicality,  iso-recursive  constructors  have  not  been  studied  as  thoroughly  as  equi- 
recursive  constructors  in  the  type  theory  literature  (perhaps  because  they  are  not  real  fixed-points), 
and  so  there  are  fewer  standard  formalisms  to  go  by^.  To  ground  our  presentation  in  reality,  we  will 
begin  with  the  limited  form  of  iso-recursive  constructor  that  appears  in  the  Middle  Intermediate 
Language  (MIL)  of  the  TILT  compiler  [19],  and  then  generalize  it  to  suit  our  purposes.  We  believe 
that  our  generalized  iso-recursive  constructors  preserve  the  decidability  of  the  MIL,  but  a  proof 
will  not  be  attempted  here. 

In  the  MIL  theory,  the  iso-recursive  constructor  is  defined  only  at  the  kind  of  a  pair  of  types  (i.e. 
TxT,  which  is  easily  generalized  in  practice  to  n-tuple  kinds)  and  takes  the  form  /3).(ci,  C2). 

The  typing  rule  is  the  same  as  for  equi-recursive  constructors,  except  that  there  is  no  contractiveness 
condition.  Modulo  /dry-equivalence,  an  iso-recursive  constructor  can  only  be  equivalent  to  another 
iso-recursive  constructor,  on  the  condition  that  their  bodies  are  equivalent. 

r[a  :  T][P  :  T]  h  Cl  :  T  r[a  :  T][f3  :  T]  h  C2  :  T 
T  h  n^{a,  P).{ci,C2)  :  TxT 

r[a  :  T][P  :  T]  h  Cl  =  C3  :  T  r[a  :  T][f3  :  T]  h  C2  =  C4  :  T 
T  h  /is(a,/3).(ci,C2)  =  ^s(q!,/3).(c3,C4)  :  TxT 

One  may  construct  an  element  of  type  ■  •).!  or  //^(-  •  •).2  with  a  roll  expression  and  deconstruct 
one  with  an  unroll  expression.  The  typing  rules  are  straightforward: 

T  h  c  :  T  T  h  c  =  ;U^(q;, /3).(ci,  C2).f  :  T  T  h  e  :  T(ci[c.l,  c.2/a, /?]) 

T  h  rollc(e)  :  T(c) 

^The  authors  would  appreciate  pointers  to  any  references  on  type  theory  of  iso-recursive  constructors. 
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r  h  e  :  T(c)  T  \- c  =  l3).{ci,C2)-i '■  T 

r  H  uiirollc(e)  :  r(ci[//^(a, /3).(ci,  02).!,  /3).(ci,  C2).2/a, /?]) 


The  MIL  formalism  is  sufficient  for  expressing  ML  datatypes,  but  to  replace  equi-recursive 
constructors,  we  need  iso-recursive  constructors  at  arbitrary  higher  kind.  The  generalized  construct 
can  take  the  form  ^^a:K.c,  and  the  corresponding  typing  and  equivalence  rules  are  easy: 

T  h  K  kind  T[a  ■.  k]\-  c  ■.  k 
T  h  n^a-.K.c  :  k 

T  \-  Ki  =  K2  kind  r[a  :  ki]  h  ci  =  C2  : 

T  h  ^^a-.Ki-Ci  =  fj,^a:K2-C2  :  ki 

The  interesting  question  is  how  to  generalize  the  typing  rules  for  roll  and  unroll.  The  problem 
is  that  rolls  and  unrolls  only  mediate  between  constructors  of  kind  type  (obviously,  since  they  are 
term-level  constructs).  In  the  case  of  the  MIL,  the  pair  kind  limited  the  c  in  rollc(e)  to  be  either 
a  first  or  second  projection  of  a  In  the  general  case,  we  propose  to  allow  c  to  be  any  series  (or 
“path”)  of  destructions  —  in  our  calculus,  either  projections  from  a  pair  or  function  applications 
—  starting  with  a  and  resulting  in  a  constructor  of  kind  T.  Formally,  we  define  these  paths  of 
destruction  P  as  follows: 

P  ::=  •  I  Pc  I  PI  I  P.2 

The  •  signifies  the  “head”  of  the  path  where  the  will  be  substituted,  and  the  notation  P{c} 
represents  P  with  c  as  the  head.  The  syntax  for  paths  does  not  enforce  either  that  the  head  is  a 
or  that  that  the  path  has  kind  T,  but  the  generalized  typing  rules  for  roll  and  unroll  do: 

T  h  c  =  P{fi^a:Kp}  :T  T  ^  c"  =  P{c'[^^^a■.K.d / a]]  :T  The:  T{c'') 

T  I-  rollc(e)  :  T(c) 

T  h  c  =  P{^sa:K.c'}  :  T  T  h  c"  =  P{d[n^a:K.c' / a]]  :T  The:  T(c) 

T  h  unrollc(e)  :  T{c") 

In  practice,  the  unrolled  constructors  c”  can  be  obtained  by  /^-normalizing  P{c'[iJ,^a:K.c' /a]}. 

4.2.2  The  Coercion  Calculus 

Recall  the  phase-splitting  interpretation  of  opaque  fixed-points  detailed  in  Section  3.1.  The  dynamic 
part  of  the  module  body,  e,  has  type  it[(c[//=s‘^:k.c/s'^])/q;],  and  the  dynamic  part  of  the  resulting 
phase-split  structure  has  type  a[^=s'^:K.c/ a].  Equi-recursiveness  makes  the  constructors  being 
substituted  for  a  here  equal,  so  e’s  actual  type  matches  the  required  type  of  the  phase-split  result 
and  the  phase-splitting  rule  was  able  to  just  wrap  e  with  a  term-level  fixed-point.  Suppose  now 
that  we  replace  the  equi-recursive  constructor  with  an  iso-recursive  constructor.  The  types  are  no 
longer  equal,  so  we  need  a  way  to  coerce  e  from  its  actual  type  to  the  required  type,  preferably 
without  any  added  run-time  cost  for  the  coercion. 

The  approach  that  we  have  investigated  is  to  use  Crary’s  “coercion  calculus”  [3].  The  broader 
goal  of  the  coercion  calculus  is  to  eliminate  inclusive  subtyping  and  bounded  quantification  in  type- 
directed  compilers,  without  introducing  the  run-time  costs  of  coercions  inherent  in  the  well-known 
“Penn  interpretation”  of  Breazu-Tannen  et  al.  [2].  Crary  achieves  this  by  putting  the  subtyping 
coercion  functions  into  a  separate  syntactic  class  so  that  they  can  be  easily  discarded  before  run¬ 
time. 
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Crary’s  system  is  appealing  for  our  purposes  because  it  can  generate  static  coercions  between 
any  two  extensionally  equivalent  types.  (By  extensionally  equivalent,  we  mean  that  both  types 
are  inclusive  subtypes  of  each  other.)  The  coercion  calculus  is  limited,  however,  in  that  it  is  only 
defined  for  a  type  system  that  lacks  higher-order  type  constructors.  In  Section  4.2.3,  we  will  specify 
a  normal  form  in  which  generalized  datatype  modules  can  be  written  so  that  the  type  of  the  code 
in  the  module  body  and  the  type  of  the  code  in  the  phase-split  form  are  indeed  extensionally 
equivalent.  Fortunately,  for  this  restricted  class  of  normal  form  modules,  the  limitations  of  the 
calculus  can  be  overcome  by  replacing  its  simple  roll/unroll  coercions  with  our  more  general  ones 
that  handle  ^^’s  of  higher  kind.  We  now  give  some  idea  of  how  coercions  work. 

Coercions  q  are  like  witnesses  to  casts  from  a  subtype  to  a  supertype,  but  they  comprise  a 
separate  syntactic  class  from  term-level  functions  in  order  to  simplify  their  eventual  elimination: 

q  ::=  id  |  rollc  |  unrollc  |  ^  92  |  9i  x  <?2  |  +  92  |  ycc.K.q 

id  is  the  identity  coercion,  rollc  and  unrollc  are  as  we  have  defined  them.  The  rest  of  the  coercions 
correspond  to  arrow  (function),  product,  sum  and  universally  quantified  types,  respectively.  The 
typing  judgment  for  coercions  takes  the  form  T  h  9  :  iTi  ^  cJ2,  signifying  that  9  is  a  coercion  from 
(Ti  to  (T2  in  context  T.  The  inference  rules  are  shown  below: 

T  h  O'  type  T  h  c  =  P{p,^a:K.c'}  :  T  T  h  c"  =  P{c'[p,^a:K.(//a]}  :  T 
ri-id:iT^(T  TF  rollc  :  T{c")  ^  T{c) 

T  h  c  =  P{pLgia:K.c'}  :  T  T  h  c''  =  P{c'[iJ,^a:K.c' /a]}  :  T 
T  b  unrollc  :  T{c)  ^  T{c”) 

T  h  9i  :  (t(  ^  iTi  T  h  92  :  CJ2  ^  0-2  r[Q;  :  k]  h  9  :  ui  ^  CJ2 

T  h  9i  ^  92  :  cJi  ^  1T2  ^  ^  cr2  T  h  \/a:K.q  :  Va:«:.iTi  ^  Va:K.cJ2 

T  h  91  :  CTi  ^  cj(  T  h  92  :  (T2  ^  <72  T  h  91  :  ui  ^  cr'^  T  h  92  :  cr2  ^  (T2 
T  h  9i  X  92  :  (Tl  X  CJ2  ^  cr'l  X  £72  T  h  9i  -I-  92  :  CTl  -I-  £72  ^  £7(  -I-  £72 

Coercions  are  folded  into  our  core  calculus  by  adding  a  new  term-level  form,  qe,  that  applies 
coercion  9  to  expression  e.  The  typechecking  rule  for  coercion  applications  is  the  obvious  thing: 

T  h  9  :  £7i  ^  £72  T  h  e  :  £7i 
T  h  9  e  :  £72 

The  dynamic  semantics  of  coercion  applications  are  incorporated  into  the  operational  semantics 
of  the  core  calculus  through  a  separate  “canonicalization”  judgement,  which  takes  applications  of 
a  coercion  to  a  value,  qv,  and  translates  them  to  a  canonical  form.  This  organization  facilitates 
a  formalization  of  the  idea  that  coercion  applications  are  static.  Specifically,  when  types,  type 
abstractions  and  coercions  are  erased  at  code  generation  time,  a  coercion  application  qv  and  its 
canonical  form  —  in  essence,  the  result  of  evaluating  the  coercion  application  —  are  shown  to 
become  the  same  untyped  code.  See  Crary  [3]  for  more  details. 

Let’s  look  at  what  coercion  is  generated  for  the  List  example.  Let  s  =  Pint  x  a,  and 

let  t  =  X  s.  Here,  s  corresponds  to  List.t  and  t  corresponds  to  the  definition  of  type  t 

in  the  module  body.  The  code  in  the  module  body  (consisting  of  the  definitions  of  nil,  cons  and 
expose)  has  type 

t  X  {int  xs^t)x{t^l  +  int  x  s) 
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while  the  type  we  need  to  coerce  to  is 

s  X  {int  xs^s)x(s^l  +  int  x  s) 

Since  s  and  t  are  extensionally  equivalent,  the  coercion  judgment  will  generate  the  following  coercion 
to  be  applied  to  the  module  body: 

rolls  X  (id  ^  rolls)  x  (unrolls  ^  id) 

Note  here  how  arrow  subtyping  —  contravariant  in  the  argument  type,  covariant  in  the  result  type 
—  yields  a  rolls  in  the  result  of  cons  and  an  unrolls  in  the  argument  of  expose. 

It  is  interesting  to  observe  that  if  we  use  the  canonicalization  judgment  to  find  the  canonical 
form  of  the  application  of  the  above  coercion  to  the  List  module  body,  we  obtain  precisely  the 
Harper-Stone  interpretation  of  the  integer  list  datatype  given  in  Section  4.1.1.  For  instance,  what 
the  arrow  coercion  id  trolls  does,  when  applied  to  the  cons  function,  is  to  leave  the  argument 
alone  and  apply  a  rolls  to  the  result.  Likewise,  the  nil  value  gets  rolled,  and  the  argument  to 
the  expose  function  gets  unrolled  while  the  result  is  unchanged. 

4.2.3  Generalized  Datatype  Normal  Form 

Define  an  opaque  recursive  module  to  be  in  generalized  datatype  normal  form  if: 

1.  The  body  (and,  recursively,  its  substructures)  consists  of  only  two  kinds  of  things:  transparent 
and  abstract  type  declarations. 

2.  A  transparent  type  declaration  is  simply  a  type  constructor  definition,  which  is  restricted  to 
appear  transparent  in  the  rds  and  may  therefore  not  contain  any  references  to  the  recursive 
module  variable. 

3.  An  abstract  type  declaration  consists  of  a  type  constructor  definition,  which  is  restricted  to 
appear  abstract  in  the  rds,  of  the  form 

t  =  \cX‘.K.  I — >  Cl ,  *  *  *  ^  In  * — ^  Cn } 

where  the  S  notation  indicates  a  labeled  sum,  with  each  li  labeling  a  branch,  and  where 
the  Ci  (in  a  dual  requirement  to  the  transparent  types)  may  not  contain  any  references  to 
other  types  defined  in  the  body.  (Note:  Non-polymorphic  type  definitions  can  be  written 
by  using  k  =  1.)  Along  with  the  type  definition  are  n  constructor  functions  for  t,  the 
i-th  one  having  type  \/a:K.  Ci^t{a)  (or  just  ya:K.t{a)  if  c*  =  1),  and  a  destructor  of  type 
ya:K.  t{a)  S{/i  ci,  •  •  • ,  i-^  c^}.  The  constructors  are  implemented  as  simple  polymor¬ 

phic  injections  into  the  sum  and  the  destructor  as  the  polymorphic  identity.  Abstract  type 
declarations  will  also  be  referred  to  as  datatype  declarations,  for  the  obvious  reason. 

We  will  not  formalize  a  normalization  process  here,  but  it  is  not  hard  to  see  how  “generalized 
datatypes”  of  the  kind  discussed  in  Section  4.1.2  can  be  converted  to  this  normal  form.  Essentially, 
(non-uniform)  datatypes  defined  mutually  recursively  through  ML’s  datatype  mechanism  need  to 
be  separated  and  turned  into  plain  polymorphic  sums  by  making  all  mutually  recursive  references 
to  themselves  or  one  another  be  module-recursive.  All  the  types  defined  in  the  module  body  are 
accessible  in  the  signature  of  the  recursive  module  variable,  so  this  is  not  a  problem. 

It  can  also  be  shown  that  the  coercion  judgment  discussed  in  the  previous  section  will  generate 
coercions  for  normal  form  modules  that  are  very  similar  in  form  to  the  coercion  generated  for  the 
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List  example.  It  is  worthwhile  noting  that  the  restriction  on  the  form  of  datatype  declarations  — 
specifically,  the  full  module-recursiveness  of  the  Cj’s  —  is  important  for  ensuring  that  we  can  use 
the  identity  coercion  id  in  precisely  the  places  it  occurred  in  the  coercion  for  List,  that  is,  in  the 
argument  types  of  the  constructor  functions  and  the  result  type  of  the  destructor. 

4.3  Comparison  of  Rds  Constructs 

In  theory,  transparent  rds’s  are  more  expressive  than  opaque  rds’s  because  they  use  equi-recursive 
constructors  to  encode  transparent  module-recursive  type  definitions.  In  practice,  however,  the  need 
for  equi-recursiveness  is  an  impediment  to  the  success  of  transparent  rds’s,  whereas  the  simplicity  of 
opaque  rds’s  makes  them  readily  usable.  The  question  is  whether  the  static-on-static  dependencies 
that  transparent  rds’s  support  can  be  limited  in  a  way  that  does  not  require  equi-recursiveness 
but  still  gives  transparent  rds’s  the  expressive  edge.  In  our  estimation  the  answer  is  no,  but  the 
argument  for  or  against  transparent  rds’s  is  wholly  dependent  on  whether  one  follows  the  opaque 
or  transparent  interpretation  of  datatype  specifications  in  signatures.  (This  is  yet  another  way  in 
which  recursive  modules  and  datatypes  are  inextricably  linked.)  We  will  therefore  first  compare 
the  two  datatype  interpretations  before  arguing  for  the  superiority  of  opaque  rds’s. 

4.3.1  Transparent  and  Opaque  Interpretations  of  Datatypes 

In  the  Harper-Stone  interpretation  of  Standard  ML,  datatype  specifications  appearing  in  signatures 
are  elaborated  in  such  a  way  that  the  (recursive)  implementations  of  the  datatypes  are  held  abstract. 
This  is  termed  the  “opaque  interpretation”  of  datatypes.  The  problem  with  making  datatypes 
abstract  is  that  uses  of  the  datatype  constructors  and  destructor  cannot  be  safely  inlined,  and  a 
function  call  is  a  heavy  cost  to  pay  for  a  sum  injection  or  a  no-op  (rolls  and  unrolls  are  generally 
implemented  as  no-ops). 

One  solution,  incorporated  into  the  FLINT  compiler  for  Standard  ML  [23]  and  later  into  the 
TILT  compiler,  is  to  use  a  transparent  interpretation  of  datatypes,  exposing  the  iso-recursive  sum 
implementation.  While  this  solves  the  efficiency  problem  with  the  opaque  interpretation,  Crary 
et  al.  have  pointed  out  that  it  introduces  a  separate  problem,  namely  that  certain  kinds  of  valid 
Standard  ML  programs  fail  to  typecheck  [4].  For  example,  suppose  that  the  argument  signature  of 
a  functor  F  includes  the  specification  of  a  type  tl  and  a  datatype  t2  that  refers  to  it: 

functor  F  (X:  sig 

type  tl 

datatype  t2  =  C  |  D  of  int  *  tl 
end)  =  ... 

Then,  suppose  F  is  applied  to  a  pair  of  mutually  recursive  datatypes  tl  and  t2  defined  as  follows: 

datatype  tl  =  A  |  B  of  string  *  t2 

and  t2  =  C  I  D  of  int  *  tl 

In  the  opaque  interpretation,  the  functor  application  is  valid  because  t2  is  abstract  in  the  argument 
signature  of  F  and  the  constructors  and  destructor  of  t2  have  the  correct  types.  In  the  transparent 
interpretation,  the  argument  signature  specifies  that  t2  must  have  type  /r^a.l  +  int  x  tl.  However, 
the  actual  argument  types  tl  and  t2  are  defined  as  the  first  and  second  projections  from  an  iso¬ 
recursive  constructor  t  of  kind  T  xT:  t  =  /3).(1  -|-  string  x  (3,1  +  int  x  a).  Under  the  simple 
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equivalence  rule  for  iso-recursive  constructors  that  was  given  in  Section  4.2.1,  there  is  no  way  to 
equate  the  actual  and  required  definitions  for  t2,  so  the  functor  application  is  not  valid  under  the 
transparent  interpretation  of  datatypes. 

One  way  to  remedy  this  problem,  as  detailed  by  Crary  et  al.  [4],  is  to  use  what  they  refer 
to  as  Shao’s  equation  (after  its  original  implementation  by  Shao  in  the  FLINT  compiler).  For 
iso-recursive  types,  the  equation  can  be  written  as  follows: 

/a]) 

For  iso-recursive  constructors  of  tuple  kind  (the  only  kind  of  iso-recursive  constructors  generated 
from  ML  datatypes),  the  rule  has  not  to  our  knowledge  been  formalized,  but  can  be  described 
in  words  as:  “Projections  from  iso-recursive  constructors  are  equal  if  their  unrollings  are  equal.” 
(Note  that  this  does  not  imply  that  a  type  is  equal  to  its  unrolling,  only  that  two  types  are  equal  if 
they  unroll  to  equal  things.)  Shao’s  equation  solves  the  problem  in  the  above  example.  The  actual 
definition  of  t2  is  t.2.  The  required  definition  of  t2  is  n^a.l  -|-  int  xt.l.  When  unrolled,  these  both 
become  1  +int  x  t.l,  so  Shao’s  equation  dictates  that  the  t2’s  match. 

While  Shao’s  equation  is  easy  to  implement  and  is  currently  employed  in  the  TILT  compiler,  it 
is  a  troublesome  solution  because  the  effect  the  equation  has  on  the  type  system  is  not  understood 
well  at  all.  Petersen  et  al.  [20]  are  developing  a  different  approach  that  returns  to  the  opaque 
interpretation  of  datatypes  and  involves  the  coercion  calculus.  Assume  that  the  constructors  of  a 
datatype  are  all  condensed  into  one  function  hide  whose  type  is  the  reverse  of  the  expose  function. 
In  other  words,  whereas  previously  each  constructor  would  perform  a  sum  injection  followed  by  a 
roll,  there  is  now  one  constructor  and  it  just  performs  a  roll;  just  as  there  is  only  one  destructor 
and  it  just  performs  an  unroll.  In  this  version  of  the  opaque  interpretation,  the  constructor  and 
destructor  are  merely  static  coercions  (as  in  the  coercion  calculus),  so  Petersen’s  idea  is  to  make 
this  apparent  in  the  datatype  signature  by  declaring  the  constructor  and  destructor  to  be  coercions 
instead  of  ordinary  functions.  This  is  actually  more  complicated  than  it  sounds,  mainly  because 
adding  coercion  variable  declarations  requires  coercions  to  be  threaded  through  the  type  system 
(and  the  compiler)  in  non-trivial  ways.  The  result,  however,  is  that  applications  of  datatype 
constructors  and  destructors  that  cannot  be  inlined  due  to  datatype  opacity  will  eventually  be 
eliminated  because  they  are  known  to  be  coercion  applications. 

4.3.2  Datatypes  and  Rds’s 

Whether  one  adopts  the  opaque  or  transparent  interpretation  of  datatypes  has  a  major  effect  on 
which  kind  of  rds  is  more  appropriate.  First,  suppose  that  datatype  specifications  are  transparent 
and  Shao’s  equation  is  used  to  solve  the  problems  discussed  above.  In  this  situation,  CHP  con¬ 
jecture,  based  on  a  small  example,  that  equi-recursiveness  is  completely  eliminable  for  transparent 
rds’s  whose  static-on-static  recursive  dependencies  are  restricted  to  occur  inside  datatypes,  i.e.  un¬ 
derneath  iso-recursive  constructors.  It  seems  likely  that  this  “datatype-on-static”  restriction,  which 
is  similar  to  our  “generalized  datatype  normal  form”  for  opaque  fixed-points,  works  in  general.  (We 
refer  the  reader  to  CHP  for  a  detailed  description  of  equi-recursiveness  can  be  eliminated.)  How¬ 
ever,  1)  the  success  of  the  CHP  restriction  relies  on  Shao’s  equation,  and  2)  transparent  rds’s  must 
be  fully  transparent.  The  latter  is  a  particularly  odd  requirement  to  justify  to  an  ML  programmer. 

Suppose  now  that  we  adopt  the  opaque  interpretation  of  datatypes.  Since  datatypes  are 
abstract,  references  to  the  recursive  module  variable  within  a  datatype  specification  are  purely 
dynamic-on-static,  occurring  in  the  types  of  the  constructors  and  destructor.  It  is  therefore  per¬ 
fectly  acceptable  for  datatype-on-static  references  to  occur  in  an  opaque  rds.  Thus,  other  than  the 
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prohibition  of  static-on-static  references  in  ordinary  type  definitions,  there  are  no  restrictions  on 
opaque  rds’s. 

The  only  distinction  between  the  opaque  and  transparent  interpretations  of  datatypes  that  is 
visible  to  the  ML  programmer  is  that  the  opaque  interpretation  enforces  datatype  generativity, 
which  is  of  questionable  import  either  way.  We  claim  therefore  that  transparent  rds’s  under  CHP’s 
datatype-on-static  restriction  offer  no  clear  benefits  over  opaque  rds’s,  while  they  require  full  trans¬ 
parency.  In  other  words,  the  very  restriction  suggested  to  make  transparent  rds’s  practical  also 
make  them  a  special  case  of  opaque  rds’s.  Betting  on  the  success  of  Petersen’s  work  on  efficient 
compilation  of  the  opaque  interpretation  of  datatypes,  we  see  no  convincing  arguments  in  favor  of 
transparent  rds’s. 

4.4  Putting  It  Together 

Having  analyzed  the  fixed-point  and  rds  constructs  on  practical  criteria,  we  now  fit  them  together 
into  a  general  design  for  a  recursive  module  extension  to  Standard  ML.  Our  proposal  consists  of  one 
simple  fixed-point  mechanism  and  one  simple  rds  mechanism.  We  will  describe  at  a  high  level  their 
elaboration  into  the  opaque  and  transparent  constructs  we  have  studied.  Since  recursive  modules 
interact  with  so  many  other  features  in  the  language,  it  should  be  noted  that  the  details  of  our 
proposal  are  not  set  in  stone.  Rather,  we  intend  this  to  serve  not  as  a  final  solution,  but  as  a 
starting  point  for  language  designers  and  implementors. 

The  rds  construct  is  a  signature  expression  of  the  form 

sig  rec  <modname>  in  <specs>  end 

where  <modname>  is  the  recursive  module  variable  and  <specs>  is  a  sequence  of  type,  datatype, 
value  and  module  specifications  comprising  the  signature  body.  Datatype  and  value  specifications 
are  permitted  to  contain  references  to  <modname>,  but  transparent  type  specifications  are  not. 
The  sig  rec  construct  is  implemented  directly  as  an  opaque  rds  —  we  are  assuming  the  opaque 
interpretation  of  datatypes  —  and  obeys  the  rds  roll  and  unroll  rules  presented  in  Section  2.4. 

The  fixed-point  construct  is  a  structure  expression  of  the  form 

struct  rec  <modnaine> :  <sig>  in  <decls>  end 

where  <niodname>  is  the  recursive  module  variable,  <sig>  is  the  rds,  and  <decls>  is  a  sequence  of 
type,  datatype,  value  and  module  declarations  comprising  the  module  body.  The  body  is  subject  to 
the  same  restriction  as  rds’s:  datatype  and  value  declarations  are  permitted  to  contain  references 
to  <modnaine>,  but  transparent  type  declarations  are  not.  Elaboration  of  the  body  is  quite  a  bit 
more  complicated  because  it  involves  a  combination  of  opaque  and  transparent  fixed-points. 

The  basic  idea  is  to  use  transparent  fixed-points  as  the  main  mechanism  for  implementing 
struct  rec  since  it  is  more  appropriate  for  general  programming,  and  to  use  opaque  fixed-points 
to  handle  the  datatype  declarations.  This  approach  gives  us  the  best  of  both  worlds:  ease  of 
programming  and  polymorphic  recursion  from  transparent  fixed-points,  module-recursive  datatypes 
from  opaque  fixed-points.  The  main  issue  in  this  setup  is  that  the  specified  rds  <sig>  is  not  required 
to  be  transparent,  so  we  need  to  reify  it  with  the  implementation  of  types  in  the  body.  To  accomplish 
this,  there  is  a  first  pass  over  the  module  body  that  extracts  all  the  datatype  and  type  declarations. 
These  are  converted  into  generalized  datatype  normal  form  and  then  “solved”  with  an  opaque 
fixed-point.  The  solutions  of  the  module-recursive  datatypes  given  by  the  opaque  fixed-point  are 
linked  back  into  the  module  body  through  datatype  copying.  For  instance,  suppose  that  the  opaque 
fixed-point  is  called  Static  (for  the  “static”  part  of  the  module  body)  and  a  datatype  dt  is  defined 
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in  substructure  A  of  the  body.  Then  dt’s  definition,  having  been  extracted  into  the  definition  of 
Static,  is  replaced  in  the  body  by 

datatype  dt  =  datatype  Static. A. dt 

which  copies  the  type  dt  along  with  its  constructors  and  destructor. 

It  is  important  here  to  mention  that  it  is  not  always  the  case  that  the  type  declarations  can  be 
extracted  and  solved  separately.  Specifically,  if  the  body  contains  module  abstraction  operations  of 
the  form  Mod  :  >  Sig,  and  Sig  contains  an  abstract  type  specification  such  as  type  t,  then  writing 
the  transparent  definition  of  Mod .  t  separately  would  break  abstraction.  We  thus  require  that  the 
static  part  of  the  body,  except  for  datatypes,  be  fully  transparent,  which  bans  most  uses  of  opaque 
signature  ascription.  We  will  return  to  this  restriction  and  its  implications  in  Section  5  on  future 
work. 

Thanks  to  this  restriction,  however,  replacing  the  module-recursive  datatype  definitions  with 
links  to  their  solutions  in  Static  removes  all  module-recursiveness  from  the  static  part  of  the  body. 
It  is  thus  ripe  for  implementation  with  a  transparent  fixed-point,  as  soon  as  the  types  in  Static  are 
used  to  fill  in  the  abstract  types  in  the  rds.  The  remainder  of  elaboration  reduces  to  transparent 
fixed-point  typechecking,  that  is,  the  body  must  satisfy  the  specifications  of  the  rds  and  be  valuable, 
in  a  context  where  the  recursive  module  variable  is  assigned  the  reified  rds  and  is  not  valuable. 

A  note  on  the  valuability  restriction:  In  this  paper,  we  have  directly  adopted  CHP’s  use  of  the 
fixed-point  expression  fix{x:a.e)  in  phase-splitting  fixed-point  modules,  thus  taking  the  compilation 
of  “dynamic-on-dynamic”  references  in  recursive  modules  for  granted.  We  have  done  so  because  the 
primary  focus  of  this  paper  has  been  on  compiling  module-recursive  references  to  types,  not  terms. 
In  practice,  however,  imposing  the  valuability  restriction  on  the  entire  recursive  module  body  may 
prove  to  be  a  hard  sell  to  ML  programmers.  It  is  simply  an  unintuitive  hindrance,  for  instance,  to  be 
prohibited  from  defining  any  top-level  non-module-recursive  ref  cells  as  one  might  use  to  implement 
“flags”  (e.g.  val  x  =  ref  true)  in  a  recursive  module.  This  might  suggest  that  if  module-recursive 
references  were  restricted  to  occur  under  A- abstractions,  as  is  implied  by  the  valuability  restriction, 
we  could  get  away  without  the  full  restriction.  A  simple  example  demonstrates  that  this  is  a  fallacy 
(here,  M  is  the  recursive  module  variable,  and  the  following  code  appears  in  the  module  body): 

fun  f  0  =  M.x 
val  X  =  f() 

The  valuability  restriction  catches  the  infinite  loop  in  the  definition  of  x  because  the  non-valuability 
of  M  means  that  f  is  a  partial  function  and  its  application  is  not  valuable.  Finding  an  intuitive 
middle  ground  between  these  two  levels  of  restrictions  will  be  an  important  prerequisite  for  the 
acceptance  of  our  recursive  module  proposal. 

Another  noteworthy  issue  involves  the  ability  of  the  module  body  to  contain  declarations  that 
are  not  specified  by  the  rds.  So  long  as  the  body  provides  all  the  components  the  rds  requires, 
it  seems  intuitive  that  providing  more  should  be  allowed,  in  the  same  way  that  a  module  in  ML 
may  be  ascribed  a  signature  that  specifies  a  proper  subset  of  the  components  the  module  actually 
provides.  However,  allowing  for  the  body  to  declare  more  than  is  specified  introduces  some  slight 
complication  because,  in  a  transparent  fixed-point,  the  body  must  match  the  rds  exactly.  This  can 
be  solved  by  first  generating  a  total  functor  that  coerces  a  module  from  the  full  signature  of  the 
module  body  to  the  rds.  Then,  the  application  of  that  functor  coercion  to  the  original  module  body 
matches  the  rds  and  can  be  used  as  the  body  of  the  transparent  fixed-point.  Harper  and  Stone  [10] 
show  how  to  generate  such  functor  coercions  automatically,  as  they  are  the  same  sort  of  coercions 
that  are  generated  during  the  elaboration  of  signature  ascription  and  functor  application. 
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Finally,  for  the  design  to  be  successful,  there  must  be  a  straightforward  way  to  explain  recursive 
module  typechecking  to  the  ML  programmer  without  referring  to  opaque  and  transparent  fixed- 
points.  Aside  from  the  valuability  restriction,  which  seems  unavoidable,  typechecking  for  our 
proposed  recursive  modules  is  actually  very  intuitive  and  can  be  presented  as  follows: 

1.  The  purpose  of  associating  an  rds  with  a  struct  rec  is  to  specify  what  the  module  body  is 
allowed  to  know  about  itself  when  it  makes  module-recursive  references. 

2.  Any  types  that  appear  abstract  in  the  rds  are  really  equal  to  their  implementations  in  the 
module  body. 

3.  Module-recursive  references  may  occur  in  datatype  declarations  and  inside  function  defini¬ 
tions,  but  nowhere  else. 

4.  Opaque  ascription  and  functor  applications  resulting  in  the  definition  of  abstract  types  are 
not  permitted  in  the  module  body. 

5.  Assume  (as  alpha-equivalence  allows)  that  the  recursive  module  variable  is  the  same  in  both 
the  recursive  module  and  the  associated  rds.  Then  the  recursive  module  typechecks  if  the 
module  body  provides  the  specifications  in  the  rds  body  and  the  module  body  is  valuable. 

The  last  part  of  the  intuitive  typechecking  description  is  a  rewording  of  the  “fundamental  property 
of  recursive  modules”  studied  in  Section  3.4.  Rather  than  saying  that  the  module  body  should 
match  the  rds,  we  say  it  should  match  the  body  of  the  rds,  which  is  an  equivalent  condition  for 
transparent  fixed-points  but  easier  to  understand. 

5  Directions  for  Future  Work 

Throughout  our  entire  study  of  recursive  modules,  with  the  exception  of  disallowing  the  use  of 
opaque  ascription  in  the  bodies  of  struct  rec’s,  we  have  made  hardly  any  mention  of  abstraction. 
The  reason  is  that  the  phase-distinction  framework  of  Harper,  Mitchell  and  Moggi,  which  serves 
as  the  very  basis  of  our  analysis,  does  not  model  abstraction  at  all.  It  is  not  a  mere  oversight: 
blithely  phase-splitting  a  module  abstraction  (such  as  a  module  of  the  form  Mod  :  >  Sig)  would 
expose  the  implementation  of  the  static  part  of  the  module,  breaking  abstraction.  We  have  already 
discussed  the  importance  of  ameliorating  the  harshness  of  the  valuability  restriction  as  one  avenue 
of  future  work.  In  this  section  we  will  look  at  two  other  directions  for  future  research  involving 
the  interplay  of  recursive  modules,  phase-splitting  and  abstraction  —  namely,  separate  compilation 
and  higher-order  modules. 

5.1  Separate  Compilation 

A  desired  feature  of  recursive  modules  that  our  proposal  does  not  provide  is  the  ability  to  separately 
compile  mutually  recursive  modules.  CHP  describe  how  to  separately  compile  substructures  of  a 
transparent  recursive  module  using  functors.  The  idea  is  that  each  substructure  can  be  written 
separately  by  abstracting  over  the  recursive  module  variable  (assuming  that  the  substructures  only 
refer  to  each  other  through  the  recursive  module  variable).  Then,  a  transparent  recursive  module 
is  used  to  join  the  implementations,  instantiating  each  functor  with  the  actual  recursive  module 
variable.  For  instance,  in  the  ExpDec  example,  Exp  and  Dec  could  be  compiled  separately  as  follows: 
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functor  ExpFun  (structure  ExpDec  :  EXPDEC)  =  ... 
functor  DecFun  (structure  ExpDec  :  EXPDEC)  =  ... 
structure  ExpDec  =  tfix  (ExpDec  :  EXPDEC. 
structure  Exp  =  ExpFun (ExpDec) 
structure  Dec  =  DecFun(ExpDec) 
end 

The  problem  in  CHP  is  that  EXPDEC  is  required  to  be  fully  transparent.  This  limits  the  usefulness 
of  separate  compilation  by  forcing  the  implementations  of  all  the  types  to  be  settled  on  beforehand. 

Our  proposed  struct  rec  appears  on  its  face  to  be  more  amenable  to  separate  compilation  than 
CHP  because  there  is  no  transparency  requirement  on  the  rds.  However,  struct  rec  elaboration 
involves  reification  of  the  entire  opaque  rds  prior  to  typechecking  of  (the  dynamic  part  of)  the 
body,  so  it  would  seem  that  the  implementation  of  all  the  types  still  needs  to  be  known  in  advance 
in  order  to  compile  any  one  submodule  separately.  In  fact,  though,  this  is  an  instance  where  our 
proposal  is  overly  simplistic.  The  purpose  of  the  two-pass  elaboration  is  to  ensure  that  all  references 
to  a  type  defined  in  the  recursive  module,  whether  they  be  local  or  module-recursive  references, 
are  equal  (e.g.  there  is  no  “opaque”  distinction  between  t  and  List.t).  Reifying  the  entire  rds 
before  elaborating  the  body  the  second  time  is  an  overly  crude  way  of  achieving  this  goal  because 
it  makes  all  the  submodules  fully  transparent  to  one  another. 

One  could  imagine  a  more  refined  design  taking  the  form  of  a  bundle  of  mutually  recursive 
module  declarations  (comprising  one  big  module),  each  of  whose  implementations  would  be  trans¬ 
parent  to  itself  but  opaque  to  all  the  other  modules.  In  order  to  prevent  the  inter-module  opacity 
from  re-introducing  the  undesirable  “opaque”  distinction,  we  could  require  that  any  inter-module 
references  be  interpreted  as  “bundle-recursive”  references  (i.e.  module-recursive  references  to  the 
recursive  bundle  variable)  rather  than  local  references  within  the  bundle.  A  simple  syntax  might 
be 

structure  rec  Mi  :>  Si  =  implementation  of  Mi 
and  ...  and  :>  Sn  =  implementation  of  M„ 

where  the  associated  rds  implied  by  this  notation  is 

sig  rec  M  in  (*  Inter-module  references  replaced  by  *) 

structure  Mi  :  S'i[M.Mj/Mj]igi..n  (*  bundle-recursive  references  to  M.  *) 

structure  Mn  ■  Sn[M.Mi/Mi]i^i,,n 

end 

Although  the  elaboration  of  the  whole  bundle  still  requires  the  reification  of  the  entire  rds  because 
of  our  interpretation  of  recursive  modules  into  transparent  fixed-points,  the  typechecking  of  each 
particular  Mj  would  only  require  type  information  gleaned  from  Mi  alone  in  order  to  reify  Si. 

A  design  along  these  lines  lends  itself  more  to  separate  compilation  than  our  original  proposal 
because  each  submodule  need  know  nothing  about  the  implementation  of  the  types  in  the  other 
submodules.  In  addition,  by  making  the  submodules  opaque  to  one  another,  it  can  be  viewed 
as  a  way  of  introducing  abstraction  into  the  body  of  a  recursive  module,  which  is  not  possible 
in  our  original  proposal.  Nevertheless,  to  actually  separate  the  submodules,  we  still  cannot  use 
functors  in  the  straightforward  way  that  CHP  did  because  functors  are  incapable  of  capturing  the 
special  two-pass  elaboration  of  struct  rec.  We  would  need  to  develop  a  new  separate  compilation 
mechanism,  which  might  only  be  able  to  do  separate  typechecking,  not  full  compilation.  As  pointed 
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out  in  recent  work  by  Pierce  and  Harper  [21],  functors  may  not  be  appropriate  as  a  general  separate 
compilation  mechanism  anyway,  so  a  special  construct  for  separately  compiling  recursive  modules 
may  be  in  order. 

In  any  case,  it  should  be  understood  that  separate  compilation  and  separate  typechecking  are 
predicated  on  the  specification  of  the  rds  a  priori.  Since  the  rds  specifies  what  each  submodule  may 
know  about  the  others,  changes  to  it  potentially  affect  the  validity  of  all  submodules  and  may  thus 
require  recompilation  of  all  of  them.  On  the  flip  side,  some  may  not  care  so  much  about  separate 
compilation  as  about  the  ability  to  simply  write  mutually  recursive  modules  in  separate  files  that 
are  compiled  together.  Such  a  “separate-location”  mechanism  is  not  fundamentally  any  different 
from  the  structure  rec  bundle  declaration  sketched  above. 

5.2  Higher-Order  Modules 

A  classic  problem  that  nicely  motivates  recursive  modules  and  that  the  usual  workarounds  described 
in  Section  1  fail  to  work  around  is  data- structural  bootstrapping  (see  Okasaki  [18]).  For  example, 
suppose  we  wish  to  implement  a  “set  of  sets” ,  defined  to  be  a  datatype  sos  that  is  either  an  integer, 
say,  or  a  set  of  sos’s  constructed  by  means  of  the  predefined  total  functor  MakeSetFn.  Assume  that 
MakeSetFn  is  defined  in  the  following  way: 

signature  KEY  =  sig  type  key;  val  compare  :  key  *  key  ->  order  end 
signature  SET  =  sig  type  element;  type  set;  (*  set  functions  *)  end 
functor  MakeSetFn  (Key: KEY)  :  SET  where  type  element  =  Key. key 
=  struct  . . .  end 

Now,  we  can  define  sos  using  struct  rec  rather  easily: 

signature  SOS  =  sig  rec  Sos  in 

datatype  sos  =  Int  of  int  |  Set  of  Sos . SosSet . set 
structure  SosSet  :  SET  where  type  element  =  sos 
end 

structure  Sos  =  struct  rec  Sos  :  SOS  in 

datatype  sos  =  Int  of  int  |  Set  of  Sos . SosSet . set 
structure  SosKey  =  struct 
type  key  =  sos 

fun  compare  (sosl,sos2)  =  ... 
end 

structure  SosSet  =  MakeSetFn (SosKey) 
end 

It  is  important  here  that  the  datatype  definition  come  before  the  functor  application  so  that 
the  transparent  type  references  type  element  =  sos  in  SOS  and  type  key  =  sos  in  Sos  are  not 
module-recursive . 

Unfortunately,  our  success  in  coding  this  example  was  subtly  dependent  on  our  use  of  transpar¬ 
ent  ascription  in  defining  MakeSetFn.  If  we  had  used  opaque  ascription  and  hidden  the  implemen¬ 
tation  of  the  set  type,  the  application  of  MakeSetFn  in  the  body  of  Sos  would  have  been  invalid. 
The  reason  is  that  functors  in  ML  are  generative,  that  is,  every  application  of  a  functor  that  has 
abstract  types  in  its  result  signature  generates  a  fresh  abstract  type,  and  abstract  types  are  not 
allowed  in  the  body  of  a  struct  rec. 

One  might  expect  this  problem  to  be  solved  by  Leroy’s  applicative  functors  [13],  which  are 
used  in  the  Objective  CAML  dialect  of  ML  [14].  Applicative  functors  do  not  generate  fresh  types 
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every  time  they  are  applied,  so  even  if  MakeSetFn  were  defined  opaquely,  every  application  of 
MakeSetFn(SosKey)  would  result  in  the  same  set  type.  To  achieve  this,  Leroy  allows  functor 
applications  to  occur  in  type  paths,  so  the  result  of  MakeSetFn(SosKey)  has  the  fully  transparent 
signature 

sig  type  element  =  sos;  type  set  =  MakeSetFn (SosKey) . set ;  ...  end 

If  MakeSetFn  were  generative,  which  is  the  only  possibility  in  Standard  ML,  MakeSetFn(SosKey)  .  set 
would  not  even  be  well-defined.  There  is  still  a  problem,  however:  Leroy  restricts  functor  applica¬ 
tions  in  paths  to  be  in  named  form,  i.e.  they  can  only  refer  to  structure  paths,  not  actual  structure 
definitions.  Our  elaboration  for  struct  rec  requires  that  we  be  able  to  extract  the  definition  of  set 
into  the  opaque  fixed-point  that  solves  the  static  part  of  the  module.  In  that  opaque  fixed-point, 
SosKey  will  only  consist  of  the  static  part  of  the  definition  of  SosKey  in  the  module  body,  so  the 
functor  application  below  is  no  longer  valid  since  the  static  part  of  SosKey  does  not  have  signature 
KEY: 

structure  StaticPartOf Sos  =  ofix  (... 
datatype  sos  =  ... 

structure  SosKey  =  struct  type  key  =  sos  end 
structure  SosSet  =  struct 
type  element  =  sos 

type  set  =  MakeSetFn(SosKey) . set  (*  Error:  SosKey  has  wrong  signature  *) 
end 

) 

Leroy’s  applicative  functors  were  motivated  by  a  desire  to  add  type-theoretic  support  to  Stan¬ 
dard  ML  for  fully  transparent  higher-order  modules  in  the  presence  of  abstraction  [15].  Shao  has 
also  presented  a  type  theory  for  higher-order  modules  that  extends  applicative  functors  and  solves 
the  problems  we  encountered  with  applicative  functors  using  “fully  syntactic  signatures”  [24].  In 
his  system,  the  set  type  above  would  be  expressed  as  MakeSetFn({type  key  =  sos}). set,  where 
MakeSetFn  denotes  the  static  part  of  MakeSetFn  —  in  essence,  Fst  (MakeSetFn).  This  type  could 
be  extracted  into  the  opaque  fixed-point  with  no  difficulty. 

The  reason  Shao’s  system  is  successful,  as  he  points  out,  is  that  his  general  approach  is  “to 
adapt  and  incorporate  the  phase-splitting  interpretation  of  higher-order  modules  into  a  surface 
module  calculus”.  However,  Shao  does  not  actually  make  use  of  a  full  phase-distinction  calculus, 
and  it  is  unclear  why  not.  One  reason  may  be  that  he  does  not  use  singleton  kinds  to  express 
type  definitions  in  signatures,  and  it  is  difficult  to  equate  an  arbitrary  translucent  signature  with  a 
primitive  signature  of  the  form  [a:K.a]  without  singleton  kinds. 

During  our  study  of  recursive  modules,  one  of  our  sanity  checks  has  been  to  attempt  to  formalize 
the  elaboration  of  our  ML  extension  in  the  Harper-Stone  framework.  The  lack  of  higher-order 
modules,  let  alone  phase-splitting,  in  the  HIL  —  the  high-level  intermediate  language  that  serves 
as  the  target  of  Harper-Stone  elaboration  —  made  extension  of  the  HIL  with  fixed-point  and  rds 
constructs  rather  difficult.  The  opaque  fixed-point  typechecking  rule,  for  example,  relies  on  the 
ability  to  compare  the  phase-split  forms  of  the  rds  and  body  signatures.  The  distinction  between 
static-on-static  and  dynamic-on-static  dependencies  that  defines  the  rds  typechecking  rule  is  hard 
to  express  in  a  calculus  without  phase-splitting,  as  is  the  equivalence  between  rds’s  and  other 
signatures  that  the  non-standard  equational  rules  of  HMM  induce.  We  therefore  believe  it  would 
be  a  useful  contribution  to  the  theory  and  practice  of  recursive  and  higher-order  modules  to  devise 
a  way  to  combine  a  full  phase-splitting  calculus  along  the  lines  of  HMM  with  a  mechanism  for 
abstraction. 
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6  Related  Work 


Flatt  and  Felleisen  [8]  have  studied  recursive  modules  in  the  form  of  units.  Units  are  essentially 
first-class  modules,  i.e.  modules  that  can  be  passed  as  values,  dynamically  linked  and  “invoked”  at 
run-time.  Each  unit  consists  of  a  set  of  import  declarations,  a  body  containing  type  and  function 
definitions,  an  initialization  expression  that  is  executed  when  the  unit  is  invoked,  and  a  set  of 
export  declarations.  The  recursive  aspect  of  units  comes  into  play  in  the  linking  of  two  units  into 
a  compound  unit:  the  first  unit’s  exports  may  be  used  to  satisfy  the  second  unit’s  imports,  while 
the  second  unit’s  exports  are  simultaneously  used  to  satisfy  the  first  unit’s  imports. 

Units  were  designed  to  serve  as  the  module  system  for  the  MzScheme  programming  environment 
for  Scheme,  which  is  no  surprise  considering  the  emphasis  on  dynamic,  first-class  modules.  Only 
after  Flatt  and  Felleisen  define  the  system  over  a  dynamically- typed  (i.e.  untyped)  core  language  do 
they  consider  the  implications  of  type  declarations,  so  issues  involving  types  are  not  at  the  center 
of  the  design  philosophy.  Although  the  extensions  to  units  for  dealing  with  type  declarations  are 
reasonable  (e.g.  using  type  dependency  declarations  to  check  that  there  are  no  cyclic  dependencies 
between  transparent  type  definitions  in  separate  units),  the  fact  that  units  are  first-class  precludes 
them  from  providing  fine-grained  control  over  type  propagation.  For  instance,  as  Flatt  and  Felleisen 
point  out,  ML- style  type  sharing  is  not  supported  by  units,  resulting  in  crude  resolutions  to  situ¬ 
ations  like  the  “diamond  import  problem”.  On  the  other  hand,  units  make  separate  compilation 
easy. 

Duggan  and  Sourelis  [6,  7]  have  presented  a  recursive  module  extension  to  ML  in  the  context 
of  mixin  modules.  Mixins  were  originally  studied  [6]  as  a  mechanism  akin  to  virtual  types  but  in  a 
non-object-oriented  (or  “data-oriented”)  setting.  They  are  defined  using  a  special  mixin  construct 
consisting  of  two  sections:  the  first  containing  mutually  recursive  type  and  function  definitions, 
the  second  containing  initialization  code  as  in  units.  One  can  write  two  separate  mixins,  each  of 
which  defines  different  branches  of  the  same  datatype(s)  along  with  functions  that  operate  over 
the  branches  it  defines.  Then,  the  two  mixins  can  be  combined  with  the  mixcomp  construct, 
which  “merges”  the  type  and  term  fixed-points.  Typically,  in  an  object-oriented  language,  one 
could  achieve  something  similar  by  defining  an  abstract  or  virtual  class  with  each  of  the  mixins  as  a 
subclass,  but  mixins  have  the  advantage  that  they  do  not  require  runtime  type  checks.  Programming 
with  mixins  enables  a  certain  kind  of  recursive  module  programming.  Mutually  recursive  modules 
can  be  written  separately  as  mixins  by  replacing  inter-module  references  with  references  to  local 
“stub”  definitions  of  datatypes  and  functions.  When  a  mixcomp  is  performed,  these  stubs  get 
merged  with  the  actual  definitions  of  types  and  functions  from  the  other  modules. 

In  their  later  work  [7],  Duggan  and  Sourelis  propose  a  mixlink  construct  that  allows  for  more 
natural  recursive  module  programming  and  has  some  interesting  correspondences  with  our  proposal, 
mixlink  is  used  to  declare  (or  “link”)  a  bundle  of  mutually  recursive  mixin  modules,  which  are 
divided  into  two  groups.  The  typing  rules  differ  for  the  two  groups  in  very  subtle  ways,  but  it 
appears  that  the  first  group  are  like  a  simplified  version  of  the  structure  rec  declarations  sketched 
in  Section  5.1,  while  the  second  group  are  similar  to  opaque  fixed-points  in  that  they  introduce  an 
“opaque”  distinction  between  the  types  defined  in  their  bodies  and  bundle-recursive  references  to 
those  types,  mixlink  is  capable  of  expressing  the  “set  of  sets”  example  discussed  in  Section  5.2, 
but  requires  the  functor  application  to  occur  in  the  second  section  of  the  mixlink.  Unfortunately, 
the  construct  is  not  amenable  to  separate  compilation  in  the  way  that  mixcomp  is.  Moreover,  it  is 
simply  unclear  why  the  two  different  styles  of  recursive  module  programming  afforded  by  mixcomp 
and  mixlink  are  tied  to  the  same  concept  of  a  mixin  module. 

Claudio  Russo  has  designed  a  recursive  module  extension  for  Standard  ML  and  implemented  it 
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in  the  Moscow  ML  2.00  compiler  [1].  The  only  documentation  seemingly  available  regarding  this 
extension  is  on  pages  23-24  of  the  Moscow  ML  Owner’s  Manual  [22].  What  can  be  discerned  from 
this  is  that  Russo’s  extension  is  cosmetically  similar  to  ours,  defining  two  new  constructs  analogous 
to  fixed-points  and  rds’s.  In  his  extension,  however,  both  module  and  signature  constructs  have 
the  same  basic  syntax: 


rec  (X : <f orwardsig>) .  <mod  |  sig> 

Here,  <forwardsig>  is  used  to  give  forward  declarations  of  any  types  or  values  that  are  referred 
to  module-recursively  in  the  fixed-point  body  (<mod>)  or  rds  body  (<sig>).  In  our  extension,  sig 
rec’s  have  no  need  for  forward  declarations  because  the  validity  of  module-recursive  references  in 
the  rds  body  can  be  determined  through  phase-splitting  it.  In  addition,  the  rds  associated  with  a 
struct  rec  supplies  the  signature  for  the  entire  module,  not  just  the  forward  declarations. 

An  odd  feature  of  Russo’s  extension  is  that  “any  opaque  type  or  datatype  specified  in  the 
signature  [associated  with  a  recursive  module]  must  be  implemented  in  the  body  [of  the  module] 
by  copying  it  using  a  forward  reference”,  demonstrated  by  the  following  example  taken  verbatim 
from  the  manual: 

structure  Ok  =  (*  well-typed  *) 

rec  (X:sig  datatype  t  =  C  type  u  type  v  =  int  end) 

struct  datatype  t  =  datatype  X.t  type  u  =  X.u  type  v  =  int  end 

structure  Fail  =  (*  ill-typed  *) 

rec  (X:sig  datatype  t  =  C  type  u  type  v  =  int  end) 

struct  datatype  t  =  C  type  u  =  int  type  v  =  int  end 

Perhaps  the  reason  this  seems  so  odd  is  that  in  our  extension,  the  analogous  encoding  of  structure 
Ok  would  be  ill-typed  and  the  analogous  encoding  of  structure  Fail  would  be  well-typed.  Still,  it  is 
not  even  clear  what  the  definition  of  type  u  in  Ok  is  supposed  to  be,  or  what  is  wrong  with  giving 
it  a  definition  in  Fail.  At  any  rate,  it  indicates  a  marked  difference  from  our  proposal,  but  the 
extent  to  which  the  designs  differ  is  impossible  to  determine  from  our  limited  understanding  of  the 
theory  and  details  of  Russo’s  extension. 
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A  Type  Theory  of  HMM  +  Singleton  Kinds  +  Recursive  Modules 

A.l  Core  Calculus 


F  h  K  kind 

_  _  F  h  c  :  r 

F  h  T  kind  F  h  1  kind  F  h  S(c)  kind 

a  0  Dom(F)  F  h  ki  kind  F[a  :  ki]  h  K2  kind 

F  h  Ila-.Ki.K2  kind 

a  0  Dom(F)  F  h  ki  kind  F[a  :  ki]  h  K2  kind 

F  h  'Pa:Ki.K2  kind 


T  \-  Ki  =  K2  kind 


F  h  Cl  =  C2  :  r 

F  h  T  =  T  kind  F  h  1  =  1  kind  F  h  S(ci)  =  S(c2)  kind 

a  0  Dom(F)  F  h  ki  =  k\  kind  F[a  :  ki]  h  K2  =  K'2  kind 
F  h  Ila:Ki.K2  =  na:KpK2  kind 
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a  0  Dom(r)  r  h  =  k[  kind  r[a  :  ki]  h  K2  =  ^2  kind 
r  h  Sa:Ki.K2  =  Sa:Ki.K2  kind 


r  h  T  <  T  A:mrf  F  h  1  <  1  kind 

r h Cl  =  C2 : r  rhc:r 

r  h  5(ci)  <  S(c2)  A:m(i  T  h  S(c)  <  T  kind 

a  0  Dom(r)  r  h  <  ki  kind  r[a  :  k'^]  \-  K2  <  n'2  kind  r[a  :  ki]  h  K2  kind 

r  h  Iia:Ki.K2  <  na:K']^.K2  kind 

a  0  Dom(r)  r  h  Ki  <  kind  r[a  :  ki]  h  K2  <  >^2  kind  r[a  :  k']\  h  K2  kind 

r  h  T,a:Ki.K2  <  Sa:KpK2  kind 


a  \  K  £T  at^GT 

rhaiK  rhaiK  Fh*:! 

a  0  Dom(F)  F  h  ki  A:mrf  F[q;  :  ki]  h  c  :  K2  F  h  ci  :  Iia-.K1.K2  F  h  C2  : 

F  h  Xa-.Ki.c  :  F[a:Ki.K2  T  h  C1C2  :  K2[c2/a] 

a  0  Dom(F)  F  h  ci  :  ki  F  h  C2  :  K2[ci/q;]  F[q;  :  ki]  h  K2  kind 

F  h  (ci,  C2)  :  i;a:Ki.K2 

F  h  c  :  Tja-.Ki.K2  F  h  c  :  Ta-.K1.K2 
F  h  c.l  :  Ki  F  h  c.2  :  K2[c.l/a] 

_  F  h  Cl  :  r  F  h  C2  :  r  F  h  ci  :  T  F  h  C2  :  F 

FhliT  FI-ci^C2:T  FI-ciXC2:T 

a  0  Dom(F)  F  h  k  kind  F[a  I  «;]  h  c  J,  k 
F  h  ^=a-.K.c  -.  K 

T  \-  C-.  k'  T  \-  k!  <  k  kind  T  \-  c:T 
T  \-  c  -.  K  F  h  c  :  S(c) 

a  0  FV(c)  F  h  \a:Ki.ca  :  ila:Ki.K2 
F  h  c  :  ila:Ki.K2 
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F  h  (c.l,c.2)  :  Ta:Ki.K2 
F  h  c  :  Ta:Ki.K2 


m 


Thcln 


r[s]  = 


r[Q;  :  k]  if  [B]  is  [a  ]  k] 
r[s  :  S]  if  [B]  is  [s  T  S] 
r[i?]  otherwise 


Thc  =  c’  :k  The'  i 
T  C  I  K 

Thci 

a  :  K  GT  _ 

Thai  rh*i 

a  0  Dom(r)  r  h  K  kind  r[Q;  :  k]\-  c  I  FhciJ,  ri-C2i 
r  h  \a:K.c  I  r  h  C1C2  i 


rhcii  rhc2i  rhei 
r  h  (ci,  C2)  i  r  h  c.i  I 


rhij,  ri-ci^c2i  ri-cixc2i 

a  0  Dom(r)  r  h  K  kind  r[a  I  k]  h  c  | 
r  h  ii=a:K.c  I 

r  h  Cl  =  C2  :  K 

r  h  c  :  K  r  h  C2  =  Cl  :  K 
ri-c  =  c:K  ri-ci  =  C2:K 

r  h  Cl  =  C2  :  K  r  h  C2  =  C3  :  K 
r  h  Cl  =  C3  :  K 


a  0  Dom(r)  r  h  Ki  =  /t'^  kind  r[a  :  ki]  h  c  =  c'  :  K2 
r  h  \a:Ki.c  =  Xa-.n'i.c'  :  Iia:ni.K2 


T  \-  ci  =  c\:  Iia-.Ki.K2  F  h  C2  =  C2  :  ki 
r  h  C1C2  =  c^C2  :  K2[c2/a] 

a  0  Dom(r)  r  h  Cl  =  c'^  :  Ki  F  h  C2  =  C2  :  K2[ci/q;]  F[q;  :  Ki]  h  K2  kind 

F  h  (ci,C2)  =  (ci,C2)  :  Tja-.Ki.K2 

T  c  =  c'  :  Ta:Ki.K2 
F  h  c.l  =  c'.l  :  Ki 

F  h  c  =  c'  :  Ta-.K1.K2 
F  h  C.2  =  c' .2  -.  K2[c.l/a] 
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r  h  Cl  =  c;  :  T  r  h  C2  =  c'2  :  T 
r  h  Cl  ^  C2  =  c'^  ^  C2  :  T 

r  h  Cl  =  c'l  :  T  r  h  C2  =  c'2  :  T 
r  h  Cl  X  C2  =  c'^  X  C2  :  T 

a  0  Dom(r)  T  \-  K  =  n'  kind  r[Q;  :  k]  h  c  =  c'  :  k 
r[a  I  k]  h  c  I  K  r[a  t  1“  c'  I  k; 
r  h  ^=a:K.c  =  ^=a:n' .c'  :  k 

r  h  Cl  =  C2  :  k'  T  k'  <  k  kind 
r  h  Cl  =  C2  :  K 

The:  S(c')  rhc  =  c':T  The:! 

rhc  =  c':T  Thebe':  S(c)  T  h  c  =  *  :  1 

a  0  Dom(r)  r  h  Cl  :  Ki  r[a  :  ki]  h  C2  :  K2 
r  h  (Aq;:ki.C2)ci  =  C2[ci/q;]  :  K2[ci/a\ 

a  0  Dom(r)  r  h  Cl  :  na:Ki.«;2  T  h  ci  :  na:Ki.«;2  r[a  :  ki]  h  cia  =  C2a  :  K2 

r  h  Cl  =  C2  :  Iia-.ni.K2 

r  h  Cl  :  Ki  r  h  C2  :  /i2  / .  ,  r,^ 

P  ,  / - r— -  =  1,2 

r  h  (Cl,C2).Z  ^  Ci'.  Ki 

a  0  Dom(r)  T  \- '£a:Ki.K2  kind 

r  h  ci-l  =  C2.I  :  Ki  r[Q;  :  Ki]  h  Ci.2  =  C2.2  :  K2[ci.l/a] 
r  h  Cl  =  C2  :  T,a-.Ki.K2 

a  0  Dom(r)  r  h  K  kind  r[a  t  1“  c  J,  k 
r  h  fi=a:K.c  =  c[(^=a:K.c)/a]  :  k 

a  0  Dom(r)  r  h  K  kind  F  h  c  =  c'[c/q;]  :  k  r[a  ]  k]\-  c'  I  k 

r  h  c  =  ii=a:K.c'  :  k 


r  h  c  :  T  r  h  (Ji  type  T 'r  <72  type 

r  h  T(c)  type  F  h  ci  ^  cj2  type 

F  h  cji  type  F  h  CJ2  type  F  h  ci  type  F  h  cj2  type 
F  h  cJi  ^  CJ2  type  F  h  cJi  x  1T2  type 
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a  0  Dom(r)  r  h  K  kind  r[a  :  k]  h  cj  type 
r  h  \/a:K.a  type 


r  h  cJi  =  iT2  type 


r  h  O'  type  r  h  1T2  =  type 

T  \-  a  =  a  type  F  h  ai  =  a2  type 

r  h  cTi  =  CJ2  type  F  h  1T2  =  CJ3  type 
r  h  fji  =  CJ3  type 

rFc=c'  :T 


r  h  T(c)  =  r(c')  type 


r  h  Cl  :  T  r  h  C2  :  r  r  h  Cl  :  T  r  h  C2  :  r 

r  I-  r(ci  ^  C2)  =  r(ci)  ^  T{c2)  type  r  l-  r(ci  x  C2)  =  T(ci)  x  T(c2)  type 


r  h  cji  =  a'l  type  T  F  02  ^  type  F  h  cri  =  a'^  type  F  h  (T2  = 


F  h  cji  ^  (T2  =  ^  a'2  type 


F  h  cJi  X  (T2  =  ct'^  X  a'2  type 


F  h  (Ti  =  a'l  type  F  h  (T2  =  type 
T  ai  ^  a2  ^  cr'i  ^  CJ2  type 


F  h  e  :  cj 


a  0  Dom(F)  F  F  k  =  k'  F[a  \  k]F  a  =  a'  type 
F  h  Fa-.K.a  =  Vq;:«;'.cj'  type 


X  :  a  gF  X ] a  gF  _ 

F  F  X  :  a  F  F  x  :  a  Fl-*:1 


X  0  Dom(F)  F  h  fj  type  F[x  :  a]  F  e  I  a 

F  h  Xx:a.e  :  a  ^  a' 

X  0  Dom(F)  F  h  CJ  type  F[x  :  a]  h  e  :  cj 

F  h  Xx:a.e  :  a  ^  a' 


F  h  Cl  :  CJ  ^  cj'  F  h  62  :  cj  F  h  ei  :  cj  ^  cj'  F  h  62  :  cj 


F  h  6162  :  a' 


F  h  6162  :  cj' 


F  h  61  :  cJi  F  h  62  :  CJ2 
F  h  (61,62)  :  cJi  X  CJ2 


F  h  6  :  cJi  X  CJ2  . .  ^ 

rhe.,:,T. 


a  0  Dom(F)  F  h  k  kind  F[a  :  k]  h  e  J,  cj 


F  F  Aa:K.e  :  Va:K. 


CJ 
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The:  \/a:K.a  T  \-  c.  k 
r  h  e[c]  :  (t[c/q;] 

X  0  Dom(r)  r  h  fj  type  r[x  ]  a]\-  e  I  a 
r  h  fix{x:a.e)  :  a 


r  h  e  I  cj 


rhei 


T  \-  e  ■.  a'  T  a  =  a'  type 
T  e  :  a 


T  e  :  a  TheJ, 
r  h  e  I  fJ 


X  :  a  G  r  _ 

rhxi  rh*i 

_  r  h  ei  I  (7i  ^  0-2  r  h  62  i 

r  h  Xx:a.e  j  T  h  6162  i 


r  h  ei  i  r  h  62  i 

r  h  (61,62)  i 


rh6i 

r  h  6.i  i 


1,2) 


a  0  Dom(r)  r  h  K  kind  r[a  ■.  k]\-  e  [  a 
r  h  Aa:K.e  J,  \/a:K.a 


rh6i 

r  h  6[c]  i 


X  0  Dom(r)  r  h  cj  type  r[x  j  a]  h  6  | 
r  h  fix{x:a.e)  | 


A. 2  Structure  Calculus 


r  h  6  :  K 


rhci 


r  h  6  :  CJ 


rh6i 


r  h  M  :  [a:K.cj] 
r  h  Fst  M  :  K 


Th  M  IS 
Th  FstM  i 

r  h  M  :  [a:K.cj] 

Th  SndM:  a[Fst  M/a] 


Th  M  IS 
Th  SndM  i 
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r  h  5  sig 

a  0  Dom(r)  r  h  K  kind  T[a  :  k]\-  a  type 
r  h  [a:K.a]  sig 

r  h  5i  =  52  sig 

T  \-  S  sig  r  h  52  =  5i  sig 

T  S  =  S  sig  T  Si  =  S2  sig 

T  \-  Si  =  S2  sig  r  h  52  =  S3  sig 
r  h  5i  =  53  sig 

a  0  Dom(r)  r  h  K  =  k'  kind  r[Q!  :  k\\-  a  =  a'  type 
r  h  [a-.K.a]  =  [a-.n! .a']  sig 


r  h  5i  <  52  sig 

r  h  5i  =  52  sig 
r  h  5i  <  52  sig 

T  \-  Si  <  S2  sig  r  h  52  <  S3  sig 
r  h  5i  <  53  sig 


a  0  Dom(r)  r  h  K  <  k'  kind  r[a  :  k]\-  a  =  a'  type  r[a  :  k']  h  a'  type 

r  h  [a-.K.a]  <  [a:K' .a']  sig 


r  h  M  :  5 

s:S£T  s]  s  er 
rhs:5  rhs:5 

a  0  Dom(r)  r  h  c  :  K  The:  a\cld\  r[a  -.  k]\-  a  type 

r  h  [c,  e]  :  [a:K.a] 


r  h  M  :  5'  S'  <S  sig  T  h  M  :  [u-.k' .a]  T  ^  Fst  M  -.  k 

r  h  M  :  5  r  h  M  :  [a:K.a] 


Th  M  IS 

s:S  eT 

ThsiS 

a  0  Dom(r)  ri-cJ,K  Thej  a[c/a\  r[Q;  :  k]\-  a  type 

r  h  [c,  e]  i  [a-.K.a] 


r  h  M  i  5'  S'  <S  sig 

rhMi  5 


M  i[a-.K'.a]  T  ^  Fst  M -.  k 
T  \-  M  I  [a-.K.a] 


T\-  Ml  =  M2:  S 


r  h  M  :  5 

T\-  M  =  M  :  S 


T\-  M2  =  Mi:  S 
r  h  Ml  =  M2  :  5 
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T\-  Ml  =  M2:  S  T\-  M2  =  M3:  S 
Th  Ml  =  M3:  S 


a  0  Dom(r)  r  h  c  =  c'  :  K  The:  a[c/a]  r[Q;  :  k]  h  a  type 
r  h  [c,e]  =  [c',e]  :  [a:K.a] 

Ml  =  M2:  S'  S<  S'  sig 

r  h  Ml  =  M2  :  5 

A. 3  Recursive  Module  Calculus 


r  h  M  :  g 

a,s  ^  Dom(r)  F  h  5  =  [cck.cti]  sig 
r[s  t  -S']  h  M  I  [a:K.(j2]  r[a  :  k]\-  ai  =  a2[a/ Fst  s]  type 
r  h  fixQ{s:S.M)  :  S 

s  0  Dom(r)  r[s  |5]I-MJ,5  ri-5  =  [a:5(c  :  K).a]  sig 
r  h  fixrp{s:S.M)  :  S 


Th  M  IS 

r  h  fixQ{s:S.M)  :  S  F  h  fixrp{s:S.M)  :  S 
Thfixo{s:S.M)  i  S  F  h  fixT{s:S.M)  j  S 

Th  Ml  =  M2:  S 


a,s  ^  Dom(F)  F  h  5  =  [aiK.cJi]  sig 
F[s  ]  S]\-  M  I  [a:K.(j2]  F[a  :  k]\-  ai  =  a2[ex/ Fst  s]  type 
F  h  5  =  5'  sig  F[s  T  5]  h  M'  j  [a:K.cJ2]  F[s  :  5]  h  M  =  M'  :  [a:K.a2] 

F  h  fixQ{s:S.M)  =  fixQ{s:S' .M')  :  S 

s  0  Dom(F)  F[s  |5]I-MJ,5  FI-5  =  [a:5(c  :  k).cj]  sig 
Th  S  =  S'  sig  F[s  T  5]  h  M'  i  5  F[s  :  5]  h  M  =  M'  :  5 

F  h  fixrp{s:S.M)  =  fixrp{s:S' .M')  :  S 


a,  s,  s'^,  s'  0  Dom(F)  F  h  k  kind  F[a  :  k]\-  ai  type  F[s'^  :  K][a  :  k]\-  cj2  type 
F[s'^  I  k]  h  c  I  K  F[s‘^  :  /i][s'’  I  (Ti[s‘^/a]]  F  e  J,  cJ2[c/a]  F[a  :  k]  h  cji  =  iT2[a/s'^]  type 
F  h  fix Q{s:[a:n.ai].[c[Fst  s / s"^],  e[Fst  s,  Snd  s'"]])  =  =  pL=s'^:K.c,  fix{s' :ai[s'^ / a\.e)\  :  [a:n.ai] 


a,  s,  s'^,  s'  0  Dom(F)  F  h  S(c  :  k)  kind  F[a  :  S(c  :  k)]  h  a  type 
F[s‘^  t  S(c  :  k)]  c'  [  S(c  :  k)  F[s'^  :  S(c  :  k)][s'’  t  F  e  J,  a[c' /a] 

F  F  fixrp[s:[a:5{c  :  K).a].[c'[Fst  s/s'^],  e[Fst  s,  Snd  s/s'^,  s'"]]) 

=  =  c,  fix{s' :a[s'^ / a].e)]  :  [a:S(c  :  K).a] 
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r  h  5  sig 


a,  0  Dom(r)  F  h  k  kind  r[s'^  :  k]  h  S'  =  [a:K.a]  sig 

r  h  ps^.S  sig 

a,  s'^  0  Dom(r)  F  h  k  kind  F[s'^  :  k]  F  S'  =  [a:5(c  :  k).cj]  sig  F[s'^  ]  k]\-  c  [  k 

F  h  ps^.S  sig 


F  h  5i  =  ^2  sig 


a,  s’^  0  Dom(F)  F  h  k;  kind  F[s'^  :  k]  h  5  =  [a:K.a]  sig  F[s'^  :  k]  S  =  S'  sig 

F  h  ps^.S  =  ps^.S'  sig 

a,  s^  0  Dom(F)  F  h  «;  kind  F[s‘^  |  k]  h  c  J,  k 
F[s'^  :  k]  h  S'  =  [a:S(c  :  k).cj]  sig  F[s‘^  :  k]  S  =  S'  sig 

F  h  psSS  =  psSS'  sig 

a,  s'^  0  Dom(F)  F  h  k  kind  F[s'^  :  «;]  [a  :  k]  h  a  type 
F  h  ps'^ .[a-.K.a]  =  [s'^:K.a[s'^ / a\\  sig 


a,  0  Dom(F)  F  h  k  kind  F[s‘^  ]  k]\-  c  [  n  F[s'^  :  K][a  \  k]\-  a  type 
F  h  ps'^.[a:5{c  :  k).o]  =  [s'^:5{p=s’^:K.c  :  n).a[s'^ /a\\  sig 
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